Why OCR Systems Break in Real-world Logistics

Sushanth Raman, CEO and founder

Before CoPallet and AI, I got my start in logistics building an OCR system for logistics documents. The vision was compelling: automate the work of reading PDFs and emails, save costs, boost accuracy. At the time, every TMS was building an OCR add-on and hundreds of companies offered standalone solutions. Executives were pouring money into automation.

Yet OCR systems have largely failed to deliver meaningful automation. Why? In retrospect, OCR technology was fundamentally ill-suited to what we were trying to accomplish.

How Does OCR Work?

Traditional OCR extracts and classifies information using pre-programmed, templated rules. It assumes data appears in specific locations and follows patterns:

City of origin is on the 3rd line in the top-left address block
PO number is found after "PO#"
Cargo width is the number under “Width” in the table of a BOL
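
To make this concrete, here is a minimal Python sketch of what those three rules amount to once the page has been turned into text. Everything in it is made up for illustration (the sample document, the function name, the field names), and real OCR products work with page coordinates and are far more elaborate. The point is only that each rule hard-codes where the data is expected to be.

```python
import re

# Hypothetical sample: text as it might look after character recognition.
SAMPLE_BOL = """ACME Foods
100 Main Street
Kansas City, MO 64105
PO# 4471-A
Width  Height  Weight
48     40      1200
"""


def extract_with_template(text: str) -> dict:
    """Apply three fixed, position-based rules to one known layout."""
    lines = text.splitlines()
    result = {"origin_city": None, "po_number": None, "width": None}

    # Rule 1: city of origin is on the 3rd line of the address block.
    result["origin_city"] = lines[2].split(",")[0].strip()

    # Rule 2: PO number is the token that follows "PO#".
    po = re.search(r"PO#\s*(\S+)", text)
    if po:
        result["po_number"] = po.group(1)

    # Rule 3: cargo width is the number directly under the "Width" header.
    for i, line in enumerate(lines[:-1]):
        headers = line.split()
        if "Width" in headers:
            values = lines[i + 1].split()
            result["width"] = float(values[headers.index("Width")])
            break

    return result


print(extract_with_template(SAMPLE_BOL))
```

On this one layout the rules work. They say nothing about what to do when the layout shifts.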

Data gets extracted based on these rules and then sent to your TMS or WMS. Simple, right? Wrong.

Infinite Document Formats

Every document looks different, and every variant required custom configuration. I worked with a mid-sized carrier that needed 500 BOL templates. Here are just a few examples of the variations we had to configure for:

Street addresses are sometimes one line, sometimes two
Inconsistent wording: "Phone" vs "Ph", BOL# vs B/L#
Tables span multiple pages, but the header is only on page 1
OCR tools would almost never read handwriting correctly
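
A small Python sketch shows how little it takes to break a positional rule. The two document variants and both helper functions are hypothetical. The only differences between the variants are a second street-address line and "B/L#" in place of "BOL#".

```python
import re


def city_from_third_line(text: str) -> str:
    # Template rule: the city is always on the 3rd line.
    return text.splitlines()[2].split(",")[0].strip()


def bol_number(text: str):
    # Template rule: the BOL number always follows the literal label "BOL#".
    match = re.search(r"BOL#\s*(\S+)", text)
    return match.group(1) if match else None


# Hypothetical variant A: the layout the template was configured for.
variant_a = "ACME Foods\n100 Main Street\nKansas City, MO 64105\nBOL# 172976\n"

# Hypothetical variant B: two-line street address and a different label.
variant_b = (
    "ACME Foods\n100 Main Street\nSuite 200\n"
    "Kansas City, MO 64105\nB/L# 172976\n"
)

for name, doc in (("A", variant_a), ("B", variant_b)):
    print(name, repr(city_from_third_line(doc)), repr(bol_number(doc)))
```

Variant A yields the city and the BOL number. Variant B yields "Suite 200" as the city and no BOL number at all, so each new variant needs yet another template.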

The setup was manual, but the maintenance was what really killed us. Brokers and carriers change their document formats all the time. Imagine squinting between two similar PDFs trying to spot the differences. We were playing “Where’s Waldo” while incoming documents were failing left and right.

Thousands of Business Rules

Extracting data is just step one. Making the data play nicely with the TMS was even more of a nightmare. The “final mile” of data delivery is loaded with rules and tribal knowledge.

Names don’t match what’s in the system. E.g. you could get BOLs from Procter & Gamble, but the customer is actually coded as P&G in the TMS. Or 100 Main Street is shorthand for 100 Main Street, Kansas City, MO.
Long-standing customers are used to unwritten agreements that they expect their rep to know. Some customers will never document that they need a liftgate. For some customers, LTL loads are always class 60 by default.
Some customers even require mental gymnastics before entering the data. One customer wanted LTL loads to be consolidated if there were multiple loads going to the same location in the same week. Another customer wanted to know if any of the commodities contained pork, so those loads didn't get assigned to Muslim drivers.
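
Written down as code, the simplest of these rules look something like the Python sketch below. The lookup tables, the Shipment type, and the specific defaults are assumptions made for illustration, not anyone's real configuration. A real deployment has thousands of entries like these, and most of them live only in operators' heads.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Made-up rule tables for illustration only.
CUSTOMER_ALIASES = {"procter & gamble": "P&G"}
ADDRESS_SHORTHAND = {"100 Main Street": "100 Main Street, Kansas City, MO"}
CUSTOMER_DEFAULTS = {
    # Unwritten agreements: default freight class, liftgate always needed.
    "P&G": {"freight_class": 60, "accessorials": {"liftgate"}},
}


@dataclass
class Shipment:
    customer: str
    pickup_address: str
    freight_class: Optional[int] = None
    accessorials: Set[str] = field(default_factory=set)


def apply_business_rules(s: Shipment) -> Shipment:
    """Normalize extracted data into what the TMS expects."""
    # Map the name printed on the document to the code used in the TMS.
    customer = CUSTOMER_ALIASES.get(s.customer.strip().lower(), s.customer)
    # Expand address shorthand into the full address.
    address = ADDRESS_SHORTHAND.get(s.pickup_address.strip(), s.pickup_address)
    # Fill in whatever the customer never writes down.
    defaults = CUSTOMER_DEFAULTS.get(customer, {})
    freight_class = s.freight_class
    if freight_class is None:
        freight_class = defaults.get("freight_class")
    accessorials = set(s.accessorials) | set(defaults.get("accessorials", ()))
    return Shipment(customer, address, freight_class, accessorials)


raw = Shipment(customer="Procter & Gamble", pickup_address="100 Main Street")
print(apply_business_rules(raw))
```

Lookup-style rules like these are the easy part. Rules such as weekly consolidation or commodity checks need context from other shipments and do not reduce to a lookup.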

Every customer and location had bespoke rules

OCR systems punt this complexity back to users. Rather than processing the document, OCR makes a first draft and leaves it to the users to confirm the results. Checking the work was barely faster than just processing the document manually from scratch.

Prompting Outperforms OCR Pattern Matching

When GPT-4 was released in 2023, I realized it was a matter of time before all these OCR tools would become obsolete. AI’s flexibility made it the perfect technology to interpret open-ended formats and business rules. So we built CoPallet to be the smartest AI agent at reading logistics docs.

Rather than looking for rigid text patterns, we teach CoPallet how to find information with a natural language prompt. What’s a prompt? You can think of it as training that you might give a new employee.

Here’s what the simplified prompt to identify a BOL number looks like:

TASK: Identify and extract the BOL number, applying order_summary prioritization rules if multiple values exist.

IDENTIFICATION:
- Focus on fields explicitly labeled with these variations: "BOL #", "Shippers BOL #", "Bill of Lading", "B.O.L.", "BL", "B/L", "BOL", "b/l", "(b/l) number", "BI/No", "BOL#", "Master B/L", "House B/L", "Waybill No.", "Transport Document No.", "Tracking #", "Reference No.", "Pro Number", or "Freight Bill No."
- Include "Sales Order #" and "Shipper's No." and "Shipment #" when they appear on the same page with explicit reference to a Bill of Lading
- Consider "Master BOL #" or "MBOL#" when primary identifiers are absent

LOCATION PRIORITY:
- Begin with page 1 before examining subsequent pages
- Examine header and footer areas first, then the main body of the document
- Give special attention to the top-right corner, which commonly contains BOL numbers

FORMAT VALIDATION:
- Accept valid formats including: all numeric (e.g., "172976"), alphanumeric with leading letters (e.g., "N04533545"), hyphenated (e.g., "88636-1340354"), or with spaces (e.g., "MSCU 123456789")
- Look for character lengths typically ranging from 6-20 characters
- Include any leading zeros (e.g., "0007243513") by capturing the entire number string
- Include prefix codes like "MBOL", "HBOL", "SCAC" or company identifiers when present
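
The FORMAT VALIDATION rules above can also be checked mechanically once the model has returned a value. The Python sketch below is one way to do that, not a description of CoPallet's internals. The function name and the exact pattern are assumptions, and because the prompt says lengths "typically" range from 6 to 20, a failed check is better treated as a reason for review than as a rejection.

```python
import re

# Groups of letters/digits, optionally joined by single hyphens or spaces.
BOL_PATTERN = re.compile(r"[A-Za-z0-9]+(?:[ -][A-Za-z0-9]+)*")


def looks_like_bol_number(candidate: str) -> bool:
    """Sanity-check an extracted value against the prompt's format rules."""
    value = candidate.strip()
    # Typical length range from the prompt.
    if not 6 <= len(value) <= 20:
        return False
    # Every example format in the prompt contains at least one digit.
    if not any(ch.isdigit() for ch in value):
        return False
    return BOL_PATTERN.fullmatch(value) is not None


# The example formats from the prompt, plus two values that should fail.
for value in ["172976", "N04533545", "88636-1340354", "MSCU 123456789",
              "0007243513", "N/A", "SEE ATTACHED"]:
    print(f"{value!r}: {looks_like_bol_number(value)}")
```

The value stays a string throughout, so leading zeros such as "0007243513" are preserved.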

Prompting is a non-trivial task that requires accumulated experience from understanding the range of document formats and how AI models respond. The experience required to teach AI how to work with logistics documents is also part of the reason why industry specialization is so important.

AI Can Reason Using Custom Rules

When you combine data extraction with CoPallet’s Memory and Reasoning Layer, you can complete shipment processing workflows from end to end with no human touches. CoPallet is able to store all your rules and tribal knowledge in its Memory Layer and retrieve the relevant information to handle each shipment.

CoPallet converts SOPs into retrievable memories

Even more powerful: Reasoning creates a step-change improvement in handwriting interpretation. For example, if you receive a BOL from P&G with a partly illegible pickup address that reads 12* M**** St, Indianap**** IN, CoPallet can confidently guess that Indianap**** is Indianapolis. But what about 12* M**** St? CoPallet can go through P&G’s shipment history and see previous shipments picked up at 127 Maple St. This is the same kind of problem-solving a human operator would perform to maximize accuracy.
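
The history lookup in that example can be approximated in a few lines of Python. This is a simplified sketch under stated assumptions, not CoPallet's implementation: illegible characters are marked with asterisks, each run of asterisks stands for one or more unknown characters, and the shipment history is a plain list of address strings. The function names are hypothetical.

```python
import re
from typing import List, Optional


def partial_to_regex(partial: str):
    """Turn '12* M**** St' into a pattern; each run of '*' is a gap."""
    legible_parts = re.split(r"\*+", partial)
    body = r"\S+".join(re.escape(part) for part in legible_parts)
    return re.compile(body, re.IGNORECASE)


def resolve_from_history(partial: str, history: List[str]) -> Optional[str]:
    """Return the one past address that fits the legible characters.

    Returns None when nothing fits or when several different addresses
    fit, because both cases need a human to look at the document.
    """
    pattern = partial_to_regex(partial)
    matches = {addr for addr in history if pattern.fullmatch(addr)}
    return matches.pop() if len(matches) == 1 else None


# Hypothetical pickup history for one customer.
history = ["127 Maple St", "127 Maple St", "48 Oak Ave"]
print(resolve_from_history("12* M**** St", history))

# If a second plausible address exists, the guess is no longer safe.
print(resolve_from_history("12* M**** St", history + ["125 Main St"]))
```

Returning nothing on an ambiguous match is deliberate: a confident wrong address is worse than a flagged one.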

Reliability Through Consensus

One of the big concerns I hear is reliability. There are actually two separate questions: What is CoPallet’s accuracy rate? And more importantly, how good is CoPallet at knowing when something might be wrong so an operator can double-check?

CoPallet handles this by running each task through multiple AI models from different providers (OpenAI, Anthropic, Google’s Gemini, Meta’s Llama) and comparing the results. When there is consensus amongst different models, it’s a sign that the response is reliable. But if the models disagree on an output, then CoPallet flags the problem to a human supervisor.
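
The comparison step can be sketched in Python as follows. This is an illustration of the idea rather than CoPallet's actual code. The model names are placeholders, the answers are hard-coded instead of coming from real API calls, and the min_agreement threshold is an assumption (the default of 1.0 means every model must agree).

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def consensus(
    answers: Dict[str, str], min_agreement: float = 1.0
) -> Tuple[Optional[str], bool]:
    """Compare answers from several models for the same field.

    Returns (value, needs_review). The value is None whenever the
    models do not agree strongly enough to trust the result.
    """
    if not answers:
        return None, True
    # Ignore differences in surrounding whitespace and letter case.
    normalized = [a.strip().upper() for a in answers.values()]
    value, votes = Counter(normalized).most_common(1)[0]
    agreed = votes / len(normalized) >= min_agreement
    return (value if agreed else None), (not agreed)


# Placeholder model names with hard-coded outputs for one BOL number.
unanimous = {"model_a": "172976", "model_b": " 172976 ", "model_c": "172976"}
split = {"model_a": "172976", "model_b": "172978", "model_c": "172976"}

print(consensus(unanimous))  # all models agree: accept the value
print(consensus(split))      # one model disagrees: flag for review
```

Lowering min_agreement would let a majority win, trading some safety for fewer flags. Requiring unanimity is the conservative choice.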

The Bottom Line

I’m probably one of about 10 people in the world to have built both an OCR system and an AI system for shipment processing. My verdict: OCR’s rigid pattern-matching can’t handle the messy reality of human-generated documents and rules. AI wins because it can interpret the varied, inconsistent ways people actually write documents and describe their rules.

So don’t be surprised if your OCR system isn’t delivering the level of automation you expected. OCR was not built for this. Get in touch with us if you’re looking to automate the range of real-world daily scenarios your operators encounter.