Hi all, I just want to mention that the copy of tesstrain.sh that ships with Ubuntu is slightly modified to make life a little easier. The very terse documentation is in the standard location:
/usr/share/doc/tesseract/README.debian

The modification saves some typing. This is an example of training for Japanese:

    git clone https://github.com/tesseract-ocr/langdata.git
    apt-get install fonts-noto-cjk fonts-japanese-mincho.ttf fonts-takao-gothic fonts-vlgothic
    tesstrain.sh --lang jpn --langdata_dir langdata

I apologize, but I don't have time to read all the questions on this thread or provide support to people having trouble. I just wanted folks (especially Nick White) to know that on Ubuntu and similar distributions, a few of the default parameters for tesstrain.sh are filled in automatically. We can do that because many of the directory locations are standardized.