[tesseract-ocr] Tesseract unstable font property prediction

Kehinde Adeoya Fri, 02 Sep 2022 01:33:50 -0700

Tesseract 3.0.5
TessData 3.0.4
Tesseract 5Java binding.

I am using Tesseract 3.0.5 in a project, and it works very well. I was also 
able to train new fonts in different languages. Lately, however, I noticed 
that its predictions change when the same code is run multiple times on the 
same image. For example, when I query the font properties of the text in an 
image, I get these properties: font name, bold, italic, monospace, serif, 
and underline. Running this multiple times on the same image produces 
different results.


The text in the image should return this result: 
Ubuntu, FALSE, FALSE, FALSE, FALSE, FALSE (PASS), but subsequent runs 
return different results for the same text in the same image:

Run     Font name      Bold   Italic  Monospace  Serif  Underline  Result
First   Ubuntu         FALSE  FALSE   FALSE      FALSE  FALSE      PASS
Second  Ubuntu-Italic  FALSE  TRUE    FALSE      FALSE  FALSE      FAIL
Third   Ubuntu-Bold    TRUE   FALSE   FALSE      FALSE  FALSE      FAIL
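
To make the comparison concrete, here is a minimal Python sketch of how the three runs above could be checked against the expected result. It does not call Tesseract at all; the `FontProps` type and the `check_runs` helper are hypothetical names used only for illustration, and the values are copied from the runs listed above.

```python
from collections import Counter
from typing import List, NamedTuple, Tuple


class FontProps(NamedTuple):
    """One font-property prediction (hypothetical container, not a Tesseract type)."""
    font_name: str
    bold: bool
    italic: bool
    monospace: bool
    serif: bool
    underline: bool


def check_runs(expected: FontProps, runs: List[FontProps]) -> Tuple[List[str], int]:
    """Mark each run PASS/FAIL against the expected result and count distinct results."""
    verdicts = ["PASS" if run == expected else "FAIL" for run in runs]
    distinct = len(Counter(runs))  # 1 would mean the prediction is stable
    return verdicts, distinct


expected = FontProps("Ubuntu", False, False, False, False, False)
runs = [
    FontProps("Ubuntu", False, False, False, False, False),
    FontProps("Ubuntu-Italic", False, True, False, False, False),
    FontProps("Ubuntu-Bold", True, False, False, False, False),
]

verdicts, distinct = check_runs(expected, runs)
print(verdicts, distinct)
```

With stable predictions, every run would be marked PASS and the number of distinct results would be 1.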

Are there settings that would make the font property prediction stable, so 
that it returns the same result on every run instead of changing each time?

