[tesseract-ocr] Train Tesseract 5 german for new font

testcoal Sun, 12 May 2024 11:53:02 -0700

Hi,

I am trying to train Tesseract 5 for a new font, specifically for German. 
I followed a tutorial I found on YouTube 
(https://www.youtube.com/watch?v=KE4xEzFGSU8) and initially had success 
when training for English. However, after switching to German, I 
encountered an error that I'm struggling to resolve.


The issue arises with the file data/deu/Apex.lstm-unicharset, which appears 
to be missing. In langdata, I've confirmed that the file deu.unicharset 
exists and is correct; all German characters are present as expected. 
However, on further inspection, I noticed that the file 
data/Apex/my.unicharset looks incomplete: not all characters from the 
all-gt dataset seem to be included.
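
To make the discrepancy concrete, here is a minimal sketch of how the 
comparison could be checked. It assumes the usual unicharset text layout 
(first line is the entry count, and each following line starts with the 
unichar as its first whitespace-separated token). The function names are 
hypothetical, and the path data/Apex/all-gt in the usage comment is my 
assumption about where the all-gt file lives. The check is per code point, 
so multi-character unicharset entries would not be matched exactly.

```python
def parse_unicharset(text):
    """Return the set of unichar entries found in unicharset file content.

    Assumption: line 1 holds the entry count, and every later line begins
    with the unichar followed by its properties.
    """
    entries = set()
    for line in text.splitlines()[1:]:  # skip the count on the first line
        parts = line.split()
        if parts:
            entries.add(parts[0])
    return entries


def missing_chars(gt_text, unichars):
    """Return the sorted non-whitespace characters of gt_text that have
    no entry in unichars (approximate: compares single code points)."""
    seen = {ch for ch in gt_text if not ch.isspace()}
    return sorted(seen - unichars)


# Usage sketch (paths are assumptions, adjust to the actual layout):
# with open("data/Apex/my.unicharset", encoding="utf-8") as f:
#     unichars = parse_unicharset(f.read())
# with open("data/Apex/all-gt", encoding="utf-8") as f:
#     print(missing_chars(f.read(), unichars))
```

If this prints umlauts or ß, those characters were likely never picked up 
when the unicharset was generated from the ground truth.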

I've reviewed the process and ensured that all steps were followed 
accurately, but I'm still encountering this error. 

