It's odd that you get better results with the German model for English 
text. That's worth investigating, in case something is amiss in your 
pre-processing or elsewhere in the pipeline.

On Thursday, May 22, 2025 at 4:36:10 PM UTC-4 [email protected] wrote:


If I run *tesseract title.jpg stdout --psm 7 --oem 1 -l eng+fra+spa+deu* 
it's faster (0.3 s) and the title is still correct.
If I run *tesseract title.jpg stdout --psm 7 --oem 1 -l eng+fra+spa+deu* 
it's even faster (0.25 s) but the title is wrong ("AVEO Segue").


Aren't these two commands the same?
 


So multiple questions here.
- Can tesseract work like a shell? I send a picture, I get the text. I send 
another picture, I get its text, without ever closing tesseract?


Using tesserocr, the Python wrapper for the Tesseract API that Zdenko 
pointed to, you have full control of the processing and how you decompose 
it.
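
As a minimal sketch of that idea, assuming tesserocr is installed and the 
traineddata files are where Tesseract expects them: the model is loaded once 
and every image is pushed through the same API object. The names ocr_many 
and run are made up for illustration; PSM.SINGLE_LINE and OEM.LSTM_ONLY are 
meant to mirror the --psm 7 --oem 1 flags used above.

```python
def ocr_many(api, paths):
    """OCR each image with one already-initialised API object.

    `api` is expected to behave like tesserocr.PyTessBaseAPI.
    Returns {path: (text, mean_confidence)}.
    """
    results = {}
    for path in paths:
        api.SetImageFile(path)             # swap in the next image; the model stays loaded
        text = api.GetUTF8Text().strip()   # runs recognition on the current image
        results[path] = (text, api.MeanTextConf())  # mean confidence, 0-100
    return results


def run(paths, lang="eng"):
    """Hypothetical entry point: open Tesseract once, reuse it for every path."""
    # Imported lazily so ocr_many stays usable without tesserocr installed.
    from tesserocr import PyTessBaseAPI, PSM, OEM
    with PyTessBaseAPI(lang=lang, psm=PSM.SINGLE_LINE, oem=OEM.LSTM_ONLY) as api:
        return ocr_many(api, paths)
```

Calling run(["title.jpg"], lang="eng+fra+spa+deu") should then behave much 
like the command line above, minus the start-up cost on every image after 
the first.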
 

- Can I get the "confidence" level for each of those predictions? It might 
help to figure out which one is the most probable.


Check out the iterator and confidence examples 
here: https://tesseract-ocr.github.io/tessdoc/APIExample.html
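
The same iterator approach is available from tesserocr. The sketch below is 
a rough Python equivalent, assuming an image has already been set on the API 
object and that it contains at least one recognized word. word_confidences 
and best_candidate are hypothetical helper names, and the (label, text, 
confidence) tuple layout is only a convention chosen for this example.

```python
def word_confidences(api):
    """Return [(word, confidence)] for the image currently set on `api`.

    `api` is assumed to be a tesserocr.PyTessBaseAPI with an image loaded.
    Confidence values are percentages in the range 0-100.
    """
    # Imported lazily so best_candidate stays usable without tesserocr installed.
    from tesserocr import RIL, iterate_level
    api.Recognize()
    words = []
    for item in iterate_level(api.GetIterator(), RIL.WORD):
        words.append((item.GetUTF8Text(RIL.WORD), item.Confidence(RIL.WORD)))
    return words


def best_candidate(candidates):
    """Pick the (label, text, confidence) tuple with the highest confidence."""
    return max(candidates, key=lambda c: c[2])
```

One way to use it: run the title through each language setup you want to 
compare, collect (language, text, api.MeanTextConf()) for each run, and pass 
the list to best_candidate. Confidence is only a heuristic, so a wrong 
reading can still score higher than a correct one.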
 
Tom
