I am using the following 22 MB eng.traineddata in my app, and it is working 
very well:
https://github.com/tesseract-ocr/tessdata/blob/main/eng.traineddata

There are some corner cases that I hoped to fix by training the model with 
this data:
https://github.com/TheBookOfMormon/TheCompleteBookOfMormon/tree/master/Data/Sources/1830PalmyraEdition/03-OCRTraining/1830PalmyraEdition

I tried training from this 22 MB file, but that does not work because it 
contains the integer version of the model.

I then tried this 15 MB file from tessdata_best:
https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata

It is a year older, and even after I have trained it, its results are not as 
good as those from the 22 MB file. In fact, the untrained 15 MB "best" file 
also gives me worse results than the 22 MB file.
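
For context, the fine-tuning workflow in question follows the usual pattern: 
extract the LSTM component from a float traineddata file with 
combine_tessdata -e, then continue training from it with lstmtraining. The 
sketch below only builds and prints those two command lines. All paths, the 
list file name, and the iteration count are hypothetical placeholders, not a 
record of the exact commands I ran.

```python
import shlex
from pathlib import Path
from typing import List


def extract_lstm_cmd(traineddata: Path, lstm_out: Path) -> List[str]:
    """Build the combine_tessdata call that extracts the LSTM component."""
    return ["combine_tessdata", "-e", str(traineddata), str(lstm_out)]


def finetune_cmd(
    lstm: Path,
    traineddata: Path,
    list_file: Path,
    model_output: Path,
    max_iterations: int = 400,  # placeholder value, tune for the data set
) -> List[str]:
    """Build the lstmtraining call that continues training from `lstm`."""
    return [
        "lstmtraining",
        "--continue_from", str(lstm),
        "--traineddata", str(traineddata),
        "--train_listfile", str(list_file),
        "--model_output", str(model_output),
        "--max_iterations", str(max_iterations),
    ]


if __name__ == "__main__":
    # Hypothetical locations; adjust to the local layout.
    best = Path("tessdata_best/eng.traineddata")
    lstm = Path("work/eng.lstm")
    commands = [
        extract_lstm_cmd(best, lstm),
        finetune_cmd(
            lstm,
            best,
            Path("work/eng.training_files.txt"),
            Path("work/palmyra"),
        ),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True) once the 
paths point at real files. The starting traineddata has to be a float model 
such as the one from tessdata_best, since continuing from the integer model 
is what fails for me.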

Where can I get the trainable version of the 22 MB file?
