I am using the following 22 MB eng.traineddata in my app, and it is working very well: https://github.com/tesseract-ocr/tessdata/blob/main/eng.traineddata
There were some corner cases that I thought I could fix by training the model with this data: https://github.com/TheBookOfMormon/TheCompleteBookOfMormon/tree/master/Data/Sources/1830PalmyraEdition/03-OCRTraining/1830PalmyraEdition

I tried training the 22 MB file, but that won't work because it is the integer version of the model.

I then tried the 15 MB file from tessdata_best: https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata

It is a year older, and after I trained it, its results are not as good as those from the 22 MB file. In fact, even without any training, this 15 MB "best" file gives me worse results than the 22 MB file.

Where can I get the trainable version of the 22 MB file?

