[tesseract-ocr] Short word detection recommendations

Jean-Marc Spaggiari Tue, 02 Apr 2024 05:46:23 -0700

Hi,

I'm trying to OCR short strings made of one letter, a space, and four digits.
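
To make the expected format concrete, here is a minimal Python sketch of a
check on the OCR output. The names EXPECTED and is_expected are only
illustrative, and it assumes the letter is always an uppercase A-Z:

```python
import re

# One uppercase letter, one space, four digits, e.g. "C 1234".
# Assumption: the letter is always uppercase A-Z.
EXPECTED = re.compile(r"[A-Z] [0-9]{4}")


def is_expected(text: str) -> bool:
    """Return True if the OCR output, with surrounding whitespace
    stripped, has the letter-space-digits shape."""
    return EXPECTED.fullmatch(text.strip()) is not None
```

A result containing only the four digits fails this check, which is exactly
the failure described below.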


I'm doing a lot of pre-processing to clean up the picture, and so far I 
end up with something like this:
[image: output6.png]
My challenge is that tesseract only detects the digits. I tried all 
the possible PSM values with the same result. The leading C is always ignored.

This is the command line that I am running:
tesseract -c tessedit_char_whitelist=" 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ" output6.png stdout
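
For anyone who wants to reproduce the test across page segmentation modes,
a rough Python sketch of how the runs could be scripted follows. The helper
names build_command and run_all are hypothetical, and it assumes the
tesseract binary is on the PATH and that --psm accepts values 0 to 13, as in
Tesseract 4 and 5. Some modes (for example 0, which is OSD only) may just
return an error or empty output:

```python
import subprocess

# Same whitelist as in the command above (note the leading space).
WHITELIST = " 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def build_command(image, psm, whitelist=WHITELIST):
    """Build the tesseract argument list for one page segmentation mode."""
    return [
        "tesseract", image, "stdout",
        "--psm", str(psm),
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def run_all(image, psms=range(14)):
    """Run tesseract once per PSM and collect the stripped stdout.

    Failures are not raised; a failing mode simply yields whatever
    (possibly empty) text tesseract printed on stdout.
    """
    results = {}
    for psm in psms:
        proc = subprocess.run(
            build_command(image, psm), capture_output=True, text=True
        )
        results[psm] = proc.stdout.strip()
    return results
```

Calling run_all("output6.png") returns a dict mapping each PSM to the text
tesseract printed, which makes it easy to compare the modes side by side.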

I tried with tesseract 5.3.0 and tesseract 5.3.4-45-g87a15 with the same 
result. 

I'm looking for recommendations on what I can do better to help 
tesseract detect the leading C correctly.

Thanks,

JMS
