> By the way, I added a create_ground_truth utility, which creates .gt.txt
> files as well as the associated .tif files for every specified font, to
> the package. I think it could be useful for anyone who does not have a
> ground truth collection yet.

Thanks, I tried it with the latest tesseract code.

1. It fails with an error when --fonts_dir is not specified; it works OK when it is specified.

2. It is very slow (about 9 minutes in the run below): it started 20 text2image processes in parallel for a training_text with 20 lines.

    create_ground_truth --fonts_dir ~/.fonts --fonts "Arial Unicode MS" corpora ground-truth
    2020-02-04 11:01:19,135 INFO Processing .txt files
    2020-02-04 11:01:19,137 INFO Generating .tif files
    2020-02-04 11:10:24,855 INFO Done

It is much faster (under 1 second) after setting OMP_THREAD_LIMIT=1:

    export OMP_THREAD_LIMIT=1
    create_ground_truth --fonts_dir ~/.fonts --fonts "Arial Unicode MS" corpora ground-truth
    2020-02-04 11:12:18,713 INFO Processing .txt files
    2020-02-04 11:12:18,715 INFO Generating .tif files
    2020-02-04 11:12:19,398 INFO Done

You can update the documentation.
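
If you would rather not rely on users exporting the variable themselves, the utility could set it for its child processes. Below is a minimal sketch of that idea, assuming the tool launches its workers with Python's subprocess module; I have not checked how create_ground_truth actually does it. The names single_threaded_env and run_all are hypothetical, and the child command is only a stand-in for the real text2image call.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def single_threaded_env(base=None):
    """Return a copy of the environment with OMP_THREAD_LIMIT set to 1."""
    env = dict(os.environ if base is None else base)
    env["OMP_THREAD_LIMIT"] = "1"
    return env


def run_all(commands, max_workers=4):
    """Run each command (a list of arguments) as a child process.

    At most max_workers children run at once, and every child inherits
    OMP_THREAD_LIMIT=1 so that parallel processes do not each start
    their own pool of OpenMP threads.
    """
    env = single_threaded_env()

    def run_one(cmd):
        return subprocess.run(
            cmd, env=env, capture_output=True, text=True, check=True
        )

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Stand-in child: prints the variable it sees. In the real utility
    # this would be the text2image command line for one font and text.
    probe = [
        sys.executable,
        "-c",
        "import os; print(os.environ['OMP_THREAD_LIMIT'])",
    ]
    for result in run_all([probe] * 3, max_workers=2):
        print(result.stdout.strip())
```

Capping max_workers is a separate knob from the thread limit. The timings above only show the effect of OMP_THREAD_LIMIT=1, so treat the worker cap as an optional extra rather than a measured improvement.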