What the customer actually sends
The customer isn't a technical expert. They know the problem — their car isn't braking well, the truck has a strange noise, a part broke — and they send whatever they have on hand. They almost never send the data the catalog needs.
In auto parts, fewer than 15% of quotes arrive with the right part number (OEM or aftermarket). The remaining 85% or more arrive in formats that a human rep has to translate into something searchable. That translation is invisible work that nobody measures but everybody pays for.
The bottleneck in quoting isn't the number of quotes. It's that every quote starts with an input that has to be interpreted before it can be quoted.
The 4 unstructured input formats
Four formats cover virtually all quotes arriving via WhatsApp in auto parts: photo, VIN, audio (or descriptive text), and screenshot. Proportions vary by business type (the corner retailer gets more photos, the wholesaler more VINs and part numbers), but the mix always includes all four.
The hidden cost of human translation
The work of translating an unstructured format into a searchable input has three costs that don't show up on the balance sheet:
1. Time: 30-50% of every quote
Averaging the per-format times (3-8 min for photo, 2-5 for VIN, 4-7 for audio, 3-6 for screenshot), translation takes 3-7 minutes per request before quoting can even begin. Measured against total quoting time (which rarely drops below 10 minutes), that's 30-50%. With 1,500 quotes/month and a 5-minute average, that's 7,500 minutes, or 125 hours/month of invisible work, equivalent to about 0.7 FTE just interpreting inputs.
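The arithmetic behind that estimate can be checked with a short sketch. The quote volume and 5-minute average come from the text; `HOURS_PER_FTE_MONTH` is an assumption (roughly 22 working days of 8 hours), so swap in your own figure.

```python
# Worked version of the translation-time estimate.
QUOTES_PER_MONTH = 1_500      # from the text
AVG_TRANSLATION_MIN = 5       # from the text
HOURS_PER_FTE_MONTH = 176     # assumption: ~22 working days x 8 hours


def translation_hours(quotes_per_month: int, avg_minutes: float) -> float:
    """Monthly hours spent interpreting inputs before quoting starts."""
    return quotes_per_month * avg_minutes / 60


hours = translation_hours(QUOTES_PER_MONTH, AVG_TRANSLATION_MIN)
fte = hours / HOURS_PER_FTE_MONTH
print(f"{hours:.0f} hours/month, about {fte:.1f} FTE")
# prints: 125 hours/month, about 0.7 FTE
```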
2. Errors: half of returns come from bad identification
When the rep guesses wrong from an ambiguous photo or audio, the customer receives the wrong part. These misidentification returns account for 40-60% of total returns in the sector. Each return costs logistics, team time, and, most expensive of all, customer trust.
3. Loss: when the rep asks for more info, the customer leaves
If the rep can't identify the part from what the customer sent, they ask for more (another photo, the VIN, make/model). That question consumes the customer's time — and in auto parts, where the customer sent the same request to multiple suppliers in parallel, it's often enough to lose the sale to whoever identified the part on the first try.
Human translation of the input isn't just a capacity bottleneck. It's the main source of identification errors and of quotes lost to friction.
What changes operationally when AI processes it
Replacing human translation with automated processing isn't just "faster". It changes 4 operational metrics at once:
- Mean response time: drops from 10-20 minutes to 30-60 seconds. Input processing takes seconds; the rest is ERP lookup and quote formatting.
- Rate of "I need more info": drops drastically because AI combines multiple signals (photo + text + prior context) instead of processing them separately.
- Identification errors: drop by 50-70%, not by magic but because AI cross-checks more data before deciding and asks for confirmation when confidence is low, instead of guessing like a rushed human.
- 24/7 response capacity: processing doesn't require a rep to be available, so nights and weekends are covered just like business hours.
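As an illustration of "combines multiple signals" and "asks for confirmation when confidence is low", here is a minimal sketch. It is not the product's actual logic: the noisy-OR combination (which assumes the signals are independent), the threshold values, and all names are hypothetical.

```python
from collections import defaultdict
from math import prod

QUOTE_THRESHOLD = 0.85    # hypothetical: quote directly at or above this
CONFIRM_THRESHOLD = 0.50  # hypothetical: ask the customer to confirm at or above this


def combine(signals: dict) -> dict:
    """signals maps a signal name (photo, text, ...) to {candidate part: score in [0, 1]}.

    Scores for the same candidate are merged with a noisy-OR, so two
    moderately confident signals that agree reinforce each other.
    """
    per_part = defaultdict(list)
    for scores in signals.values():
        for part, score in scores.items():
            per_part[part].append(score)
    return {part: 1 - prod(1 - s for s in scores) for part, scores in per_part.items()}


def decide(signals: dict) -> tuple:
    """Return (action, part): quote, confirm with the customer, or ask for more info."""
    combined = combine(signals)
    if not combined:
        return ("ask_more_info", None)
    part, confidence = max(combined.items(), key=lambda kv: kv[1])
    if confidence >= QUOTE_THRESHOLD:
        return ("quote", part)
    if confidence >= CONFIRM_THRESHOLD:
        return ("confirm", part)
    return ("ask_more_info", None)
```

With a photo alone scoring 0.7 for a brake pad, this sketch asks for confirmation; adding a text signal of 0.6 for the same part raises the combined score to 0.88 and it quotes directly.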
A typical distributor with 1,500 monthly quotes sees, in the first 60 days, response time dropping from ~25 minutes to <1 minute, identification-error returns falling from 6% to 2-3%, and quote-to-order conversion rising 5-12 percentage points.
What AI still doesn't solve well
To be technically honest, there are 4 scenarios where AI fails or requires human intervention. They are worth knowing before implementing:
1. Unusable photo
When the photo is completely blurry, too dark, or shows something irrelevant (the customer's hand, part of the workshop floor), AI can't infer the part. It asks for another photo. If the customer can't send one, it escalates to a human.
2. Incomplete, mistyped, or nonexistent VIN
VINs issued before 1981 don't follow the standard 17-character format, rare imported vehicles may not be in the databases, and transcription errors are sometimes unrecoverable without a photo of the document. In those cases AI asks for a photo of the VIN or escalates.
3. Very local or ambiguous slang
Sector vocabulary varies notably across LATAM markets. In Mexico a customer asks for a balero for a Tsuru or a mofle for their Chevy; in Colombia a customer asks for a rodamiento for a Renault Logan; in Argentina, a ruleman for a VW Gol or Renault 12. In Chile the slang is similar to Colombia's, but the Hyundai Accent and Nissan V16 dominate; in Peru the mix combines Mexican and Colombian terms with high Asian-brand penetration. On top of that, a description like "the one above the driveshaft" can refer to any of several parts depending on the vehicle. AI asks before quoting wrong, but that question is extra friction that in some cases costs the sale. Mitigation: train the digital collaborator with vocabulary specific to the market it operates in (a collaborator for Mexico shouldn't assume it knows Argentine lunfardo).
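One simple way to picture market-specific vocabulary is a per-market lookup that returns nothing when a term is unknown in that market, which is the signal to ask the customer. This is only a sketch: the `VOCABULARY` table, the market codes, and the English canonical labels are illustrative, built from the examples in this article, and a real system would use a far larger vocabulary.

```python
from typing import Optional

# Hypothetical per-market vocabulary; regional terms come from the examples above.
VOCABULARY = {
    "MX": {"balero": "bearing", "mofle": "muffler"},
    "CO": {"rodamiento": "bearing"},
    "AR": {"ruleman": "bearing"},
}


def normalize(term: str, market: str) -> Optional[str]:
    """Map a regional term to a canonical part name for the given market.

    Returns None when the term is unknown in that market, so the caller
    can ask for clarification instead of guessing.
    """
    return VOCABULARY.get(market, {}).get(term.strip().lower())


print(normalize("Balero", "MX"))   # bearing
print(normalize("ruleman", "MX"))  # None: Argentine term, not assumed in the Mexican market
```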
4. Custom or very low-volume vehicles
Modified vehicles, grey imports, and public-service vehicles with special parts are all cases where the standard catalog doesn't apply. They're escalated to humans with all the context already gathered by AI (photo, VIN, description).
In practice, a well-implemented AI handles 70-85% of quotes end-to-end and routes the rest to the human team. The difference from the status quo: when it routes, it routes with full context. The rep doesn't start from scratch.
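"Routes with full context" can be pictured as a handoff record that carries whatever was already gathered. The sketch below is an assumed shape, not a documented schema; the `Handoff` class and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Handoff:
    """Context passed to the human rep when a quote is escalated (hypothetical shape)."""
    reason: str
    photo_urls: List[str] = field(default_factory=list)
    vin: Optional[str] = None
    description: Optional[str] = None

    def missing(self) -> List[str]:
        """List the context the rep would still have to request from the customer."""
        gaps = []
        if not self.photo_urls:
            gaps.append("photo")
        if self.vin is None:
            gaps.append("vin")
        if not self.description:
            gaps.append("description")
        return gaps


ticket = Handoff(reason="grey import outside catalog",
                 vin="1M8GDM9AXKP042788",
                 description="rear brake caliper, left side")
print(ticket.missing())  # ['photo']
```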
How to start processing unstructured inputs
Recommended order to incorporate this type of processing:
- Month 1: map the 4 formats in your actual operation (how many quotes arrive in each, average translation time, "I need more info" rate). Without that measurement, you don't know what to optimize first.
- Months 2-3: implement automated VIN processing, the format with the fastest ROI because the VIN already follows a standard and AI can be very precise.
- Months 3-4: add photo processing. Photo is the most common format and where you gain the most in translation time.
- Months 4-6: enable audio and screenshot. They have lower volume but cause high friction when humans process them.
- Ongoing: periodically measure identification errors. The AI's learning curve with your specific catalog shows up in the first 90 days.
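For the Month 1 measurement, a small script over a quote log is enough to get the three numbers per format. This is a sketch under assumptions: the row layout `(format, translation_minutes, needed_more_info)` is hypothetical, and the sample rows are made up purely to show the output shape.

```python
from collections import defaultdict
from statistics import mean


def baseline(rows):
    """Per-format share of quotes, average translation time, and 'I need more info' rate.

    rows: iterable of (format, translation_minutes, needed_more_info) tuples.
    """
    rows = list(rows)
    by_format = defaultdict(list)
    for fmt, minutes, more_info in rows:
        by_format[fmt].append((minutes, more_info))
    report = {}
    for fmt, items in by_format.items():
        report[fmt] = {
            "share": len(items) / len(rows),
            "avg_minutes": mean(m for m, _ in items),
            "more_info_rate": sum(1 for _, asked in items if asked) / len(items),
        }
    return report


# Made-up sample log, for illustration only.
sample = [("photo", 4, False), ("photo", 6, True), ("vin", 3, False), ("audio", 5, True)]
for fmt, stats in baseline(sample).items():
    print(fmt, stats)
```

Sorting the report by `share * avg_minutes` gives a rough ranking of where translation time is actually being spent.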
Frequently asked questions
What percentage of quotes arrives with an exact part number?
In the auto parts sector across LATAM, fewer than 15% of quotes arrive with a direct, correct part number (OEM or aftermarket). The rest arrive as a photo (~40%), vehicle VIN (~25%), audio or descriptive text in regional slang (~15%), or screenshot (~5-20% depending on customer type). The human rep has to translate all those formats before being able to quote.
Can AI quote from a blurry or low-quality photo?
It depends on the level of degradation. Current vision AI (multimodal models) works with imperfect photos (dirty, partially mounted, poorly lit) and still identifies the part with reasonable accuracy if there are distinguishing features. When the photo is unreadable, it asks the customer for more information instead of guessing. That reduces identification-error returns.
How does AI process the vehicle VIN?
A VIN is 17 characters long and follows an international standard encoding. AI validates it (length, check digit, and prohibited characters: I, O, and Q are never used) and decodes it against databases (NHTSA plus regional decoders for local brands). It retrieves make, model, year, engine, and submodel, and filters the catalog to parts compatible with that specific vehicle.
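The validation step (length, prohibited characters, check digit) can be sketched in a few lines. The function name and return values are illustrative; the check-digit computation is the standard position-9 scheme (transliterate letters to numbers, apply fixed weights, take the sum modulo 11, with 10 written as X). Decoding against NHTSA or regional databases is a separate step and is not shown.

```python
import re

# Letter-to-number transliteration used by the VIN check digit (I, O, Q are not allowed).
_VALUES = {
    **{str(d): d for d in range(10)},
    **dict(zip("ABCDEFGH", range(1, 9))),
    **dict(zip("JKLMN", range(1, 6))),
    "P": 7, "R": 9,
    **dict(zip("STUVWXYZ", range(2, 10))),
}
# Positional weights; position 9 (the check digit itself) has weight 0.
_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def vin_problems(raw: str) -> list:
    """Return a list of validation problems; an empty list means the VIN looks valid."""
    vin = raw.strip().upper()
    if len(vin) != 17:
        return ["length"]
    if re.search(r"[^A-HJ-NPR-Z0-9]", vin):
        return ["prohibited_character"]
    remainder = sum(_VALUES[c] * w for c, w in zip(vin, _WEIGHTS)) % 11
    expected = "X" if remainder == 10 else str(remainder)
    return [] if vin[8] == expected else ["check_digit"]


print(vin_problems("1M8GDM9AXKP042788"))  # []
print(vin_problems("1M8GDM9AXKPO42788"))  # ['prohibited_character'] (letter O typed for zero)
```

One caveat: the check digit is mandatory for vehicles built for the North American market, while VINs from some other markets may not carry a valid one. A `check_digit` result is therefore better treated as a reason to ask for a photo of the VIN than as an outright rejection.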
Does it work with audio in Mexican, Colombian, or Argentine regional slang?
Current models transcribe LATAM accents with good accuracy and handle common regional auto parts slang (balero/rodamiento, chumacera/cojinete, mofle/escape, etc.). When there's a very local or ambiguous term, AI asks for clarification before quoting. Accuracy improves when the digital collaborator is trained with vocabulary specific to the market it operates in.
When do I need human intervention on a quote like this?
Four scenarios: 1) unusable photo the customer can't improve; 2) very custom part or rare imported vehicle outside the standard catalog; 3) special pricing decisions (discount, credit terms); 4) escalation when the customer wants to talk to a person. AI handles 70-85% of quotes end-to-end and routes the rest to the human team with all the context already gathered.
Want to see it quote from a photo?
Live demo with your inputs
30 minutes. We send Victoria a real photo, a VIN, and an audio message from your day-to-day, and watch how she quotes in real time.