Since no one else has replied, I'll offer a couple of suggestions.

On Thursday, July 10, 2025 at 8:39:00 AM UTC-4 [email protected] wrote:
> I was trying to OCR the text printed on a uniqlo T-shirt: https://www.uniqlo.com/uk/en/products/E480814-000/00

Why? Would it be more cost-effective to just have it double/triple-keyed and compare the transcriptions?

> The source image I used is attached. It looks quite clean to me but I am still facing issues to properly transcribe it.

You don't say what pre-processing you did. Did you remove all the orange? Anything else?

> I am missing the single quote character (') in my whitelist but wasn't sure how to provide this.

That's a basic shell quoting issue that the documentation for your shell should cover.

> Are there better settings to use for such a use case?

Maybe? But there may also be much more efficient ways to crack this particular nut than using OCR.

Tom
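
On the whitelist quoting question: in a POSIX shell, a single-quoted string cannot contain a single quote, so the character has to be supplied either inside double quotes or via the `'"'"'` idiom. Below is a minimal Python sketch of that idea, assuming the whitelist is passed with tesseract's `-c tessedit_char_whitelist=...` option. The file name `shirt.png` and the whitelist contents are hypothetical placeholders, and the helper names are made up for illustration.

```python
import shlex


def build_tesseract_args(image_path, whitelist):
    """Return an argument list for the tesseract CLI with a character whitelist."""
    return [
        "tesseract",
        image_path,
        "stdout",  # write recognized text to standard output
        "-c",
        f"tessedit_char_whitelist={whitelist}",
    ]


def to_shell_line(args):
    """Render the argument list as one safely quoted POSIX shell command line."""
    return shlex.join(args)


# Hypothetical inputs: an image name and a whitelist that ends with a single quote.
args = build_tesseract_args("shirt.png", "ABCDEFGHIJKLMNOPQRSTUVWXYZ'")

# Show the command as it would have to be typed in a POSIX shell.
print(to_shell_line(args))
```

If the command is launched from Python anyway, passing `args` as a list to `subprocess.run` sidesteps shell quoting entirely, because no shell parses the arguments.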

