Hello Shree,

I just uploaded a new version of the package. About the fixes:

1. --fonts_dir: I added a default value for the fonts directory on each 
platform, so the option can now be omitted.

2. Number of threads: I also capped the maximum number of threads at the 
number of CPUs.
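
For illustration, the two changes amount to something like the Python sketch below. It is not the package's actual code: the function names default_fonts_dir and capped_thread_count and the per-platform paths are assumptions chosen only to show the idea.

```python
import os
import sys
from pathlib import Path


def default_fonts_dir(platform: str = sys.platform) -> Path:
    """Return a plausible default fonts directory for the given platform.

    The paths are common conventions, assumed here for illustration.
    """
    if platform.startswith("win"):
        # Windows keeps system fonts under %WINDIR%\Fonts.
        return Path(os.environ.get("WINDIR", r"C:\Windows")) / "Fonts"
    if platform == "darwin":
        return Path("/Library/Fonts")
    # Linux and other Unix-like systems.
    return Path("/usr/share/fonts")


def capped_thread_count(requested: int) -> int:
    """Cap the requested number of worker threads at the CPU count."""
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, cpus))
```

With a cap like this, a training text with 20 lines on a 4-core machine would start at most 4 workers at a time instead of 20.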

Could you please re-test it?



On Tuesday, 4 February 2020 at 12:21:49 UTC+1, shree wrote:
>
>> By the way, I added a create_ground_truth utility, which creates .gt.txt 
>> files as well as the associated .tif files for every specified font, to 
>> the package. I think it could be useful for anyone who does not have a 
>> ground truth collection yet.
>
> Thanks, I tried it with latest tesseract code.
>
> 1. Error when --fonts_dir is not specified, works ok, when specified.
>
> 2. Very slow (10 mins), started 20 text2image processes in parallel for 
> training_text with 20 lines.
>
>  create_ground_truth --fonts_dir ~/.fonts --fonts "Arial Unicode MS" 
> corpora ground-truth
> 2020-02-04 11:01:19,135 INFO     Processing .txt files
> 2020-02-04 11:01:19,137 INFO     Generating .tif files
> 2020-02-04 11:10:24,855 INFO     Done
>
> Much faster (1 second) after setting  export OMP_THREAD_LIMIT=1
>
>  export OMP_THREAD_LIMIT=1
>  create_ground_truth --fonts_dir ~/.fonts --fonts "Arial Unicode MS" 
> corpora ground-truth
> 2020-02-04 11:12:18,713 INFO     Processing .txt files
> 2020-02-04 11:12:18,715 INFO     Generating .tif files
> 2020-02-04 11:12:19,398 INFO     Done
>
> You can update the documentation.
>
> <http://bhajans.ramparivar.com>
>
