You are correct; I did miss that section. Inverting the image seems to 
produce better results.
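
A minimal sketch of that preprocessing step, assuming Pillow is installed. 
The function name `preprocess` and the file names in the comments are only 
illustrative and not part of tesseract or pyocr. It also covers the 
rescaling workaround mentioned earlier in the thread, since both are plain 
image operations applied before OCR.

```python
from PIL import Image, ImageOps  # Pillow; assumed to be installed


def preprocess(img, scale=1.0, invert=True):
    """Return a grayscale copy of img, optionally inverted and rescaled.

    Hypothetical helper for illustration only.
    """
    # Convert to 8-bit grayscale first; ImageOps.invert does not accept
    # every image mode (for example RGBA).
    gray = img.convert("L")
    if invert:
        # Swap dark and light so light-on-dark text becomes dark-on-light.
        gray = ImageOps.invert(gray)
    if scale != 1.0:
        new_size = (
            max(1, round(gray.width * scale)),
            max(1, round(gray.height * scale)),
        )
        gray = gray.resize(new_size)
    return gray


# Example use (placeholder file names):
#   preprocess(Image.open("input.png"), scale=2.0).save("preprocessed.png")
# and then run tesseract on preprocessed.png as before.
```

Whether inversion, rescaling, or both help will depend on the image, so it 
is worth trying each on its own.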

Because the images are simple and the resulting text was not even close, I 
assumed it wasn't a quality problem so much as an option I was missing, so 
that is what I was looking for.

Anyway, thank you for pointing it out.

On Sunday, August 10, 2025 at 3:44:50 PM UTC-4 zdenop wrote:

> Seems like you missed this: 
> https://github.com/tesseract-ocr/tessdoc/blob/main/ImproveQuality.md...
>
> Zdenko
>
>
> On Sun, Aug 10, 2025 at 21:16 Thomas McGrew <[email protected]> wrote:
>
>> I had looked through that some, but I looked again and I don't see 
>> anything in the documentation that addresses this problem. Is there 
>> something in particular in the documentation that I should read?
>>
>> I know how to install the application, as I have already done so, and I 
>> know how to run it. I'm mostly using it via pyocr, but the command line 
>> gives the same result. I have the models for OSD, English and Japanese 
>> installed. I have run tesseract on thousands of images like this, and 99% 
>> of the time it works fine.
>>
>> If the model sometimes hallucinates and there is nothing to be done then 
>> that's fine and just something I'll have to work around. I did find that 
>> scaling an image to a different size does generally make tesseract read the 
>> text correctly when this happens, for whatever reason.
>>
>> On Sunday, August 10, 2025 at 5:18:53 AM UTC-4 zdenop wrote:
>>
>>> https://github.com/tesseract-ocr/tessdoc
>>>
>>> Zdenko
>>>
>>>
>>> On Sun, Aug 10, 2025 at 10:25 Thomas McGrew <[email protected]> wrote:
>>>
>>>> I read the man page and the command-line help. Unless you're referring 
>>>> to some other documentation, then yes, I read it.
>>>>
>>>> Thomas McGrew
>>>>
>>>> On Sun, Aug 10, 2025, 03:41 Thomas McGrew <[email protected]> wrote:
>>>>
>>>>> I'm trying to understand why tesseract is detecting this text 
>>>>> incorrectly.
>>>>>
>>>>> --oem 0 has issues with italics, so I've been using --oem 1, however 
>>>>> on this one image (that I've noticed so far), it seems to be totally 
>>>>> incorrect.
>>>>>
>>>>> The image clearly contains only the text "'Kaay."
>>>>> Yet tesseract reads the text with --oem 1 as "LECEVA"
>>>>> --oem 0 does read the text correctly.
>>>>>
>>>>> I'm using the default psm of 3, but no others I have tried seem to 
>>>>> read the text correctly.
>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to a topic in the 
>>>>> Google Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this topic, visit 
>>>>> https://groups.google.com/d/topic/tesseract-ocr/TRLTSbSg_30/unsubscribe
>>>>> .
>>>>> To unsubscribe from this group and all its topics, send an email to 
>>>>> [email protected].
>>>>> To view this discussion visit 
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/017ef73f-c695-4a06-819a-9f2b46ab3e89n%40googlegroups.com
>>>>>  
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/017ef73f-c695-4a06-819a-9f2b46ab3e89n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>
