>
> By the way, I added a create_ground_truth utility, which creates .gt.txt
> files as well as the associated .tif files for every specified font, to
> the package. I think it could be useful for anyone who does not have a
> ground truth collection yet.
>
Thanks, I tried it with the latest tesseract code.

1. It fails with an error when --fonts_dir is not specified; it works
fine when the option is given.

2. It is very slow (about 10 minutes): it started 20 text2image processes
in parallel for a training_text with 20 lines.

 create_ground_truth --fonts_dir ~/.fonts --fonts "Arial Unicode MS" corpora ground-truth
2020-02-04 11:01:19,135 INFO     Processing .txt files
2020-02-04 11:01:19,137 INFO     Generating .tif files
2020-02-04 11:10:24,855 INFO     Done

It is much faster (under 1 second) after setting OMP_THREAD_LIMIT=1:

 export OMP_THREAD_LIMIT=1
 create_ground_truth --fonts_dir ~/.fonts --fonts "Arial Unicode MS" corpora ground-truth
2020-02-04 11:12:18,713 INFO     Processing .txt files
2020-02-04 11:12:18,715 INFO     Generating .tif files
2020-02-04 11:12:19,398 INFO     Done

You can update the documentation.
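
Alternatively, the utility could set the variable itself for the processes it starts, so that users do not have to export it. Below is a minimal sketch of that idea, assuming the slowdown comes from each parallel process starting its own OpenMP threads. The names single_thread_env and run_all are hypothetical and not part of the package, and the commands in the usage part are placeholders standing in for the real text2image calls.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def single_thread_env(limit=1):
    """Return a copy of the current environment with OMP_THREAD_LIMIT set."""
    env = os.environ.copy()
    env["OMP_THREAD_LIMIT"] = str(limit)
    return env


def run_all(commands, max_workers=4):
    """Run each command (a list of arguments) as a child process.

    Every child inherits OMP_THREAD_LIMIT=1. Exit codes are returned
    in the same order as the commands.
    """
    env = single_thread_env()

    def run(cmd):
        return subprocess.run(cmd, env=env).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))


if __name__ == "__main__":
    # Placeholder commands: each child only prints the limit it sees.
    show = "import os; print(os.environ['OMP_THREAD_LIMIT'])"
    print(run_all([[sys.executable, "-c", show]] * 3))
```

Only the child processes see the changed environment; the calling shell is left untouched.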

<http://bhajans.ramparivar.com>
