Dear All,
I’m looking for advice because I am stuck. I’m training Tesseract to do
optical character recognition of texts in Lushootseed, an Indigenous
language of Washington State with no living speakers. The language has some
special characters and many diacritics, and I do not know what the
I have uploaded the results of various trainings for IAST (with diacritics)
and Devanagari for Sanskrit at
https://github.com/Shreeshrii/tess5training-sanskrit-iast/tree/main/tessdata/best
. The traineddata files and the corresponding lstm-unicharset have been
uploaded there.
The training has been
Hello,
I would like to train *Tesseract 4* to recognize certain scripts/languages
based on real images rather than synthetic ones. Here are my questions:
1. Is there a tool, preferably a cross-platform (Windows/Linux) GUI, that
assists in creating .box files from scanned images? How to get
co