@Shree Thanks for the tip. Just 2 quick questions.

1) The wiki page https://github.com/tesseract-ocr/tesseract/wiki/Data-Files says that the "osd" and "equ" traineddata files are compatible between Tesseract 3 and 4. In the GitHub tessdata_fast repo (https://github.com/tesseract-ocr/tessdata_fast), "osd" is present, with the commit message "Use legacy Orientation Script Detector (OSD) because that is the only thing that currently works." However, "equ" is not in the repo. Did the maintainer simply forget to include the "equ" data file?
2) Also, with tessdata_fast, I was able to get Tesseract 4 running faster than Tesseract 4 with tessdata. However, is Tesseract 4 supposed to be slower than Tesseract 3? That is what I'm experiencing.

# Here are the updated instructions to download tessdata_fast, which I tested and found to be faster than tessdata.
# Note that when calling Tesseract from the command line, the argument "--oem 2" will no longer work.
# Use "--oem 1" instead, since tessdata_fast contains only the neural net LSTM model.
# -O saves each file under its plain name rather than "<name>.traineddata?raw=true".
wget -O osd.traineddata "https://github.com/tesseract-ocr/tessdata_fast/blob/master/osd.traineddata?raw=true"
wget -O eng.traineddata "https://github.com/tesseract-ocr/tessdata_fast/blob/master/eng.traineddata?raw=true"
wget -O chi_sim.traineddata "https://github.com/tesseract-ocr/tessdata_fast/blob/master/chi_sim.traineddata?raw=true"
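
For anyone who wants to script the download, here is a minimal sketch that wraps the same wget calls in two shell functions. The function names (tessdata_fast_url, fetch_tessdata_fast), the ./tessdata_fast directory, and sample.png are my own placeholders, not part of Tesseract. The URL pattern is the one from the commands above.

```shell
# Hypothetical helper: print the tessdata_fast download URL for one language code.
tessdata_fast_url() {
    printf 'https://github.com/tesseract-ocr/tessdata_fast/blob/master/%s.traineddata?raw=true\n' "$1"
}

# Hypothetical helper: download the given languages into a target directory.
# Usage: fetch_tessdata_fast DEST_DIR LANG [LANG...]
fetch_tessdata_fast() {
    tf_dest=$1
    shift
    mkdir -p "$tf_dest" || return 1
    for tf_lang in "$@"; do
        # -O stores the file under its plain name, without the "?raw=true" suffix.
        wget -q -O "$tf_dest/$tf_lang.traineddata" "$(tessdata_fast_url "$tf_lang")" || return 1
    done
}

# Example (not run here): fetch the three files, then time an LSTM-only run.
#   fetch_tessdata_fast ./tessdata_fast osd eng chi_sim
#   time tesseract sample.png out --tessdata-dir ./tessdata_fast --oem 1 -l eng
```

Running the same "time tesseract ..." line against a Tesseract 3 install on the same image should give a like-for-like speed comparison between the two versions.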

