Re: [tesseract-ocr] Re: Announcement: Python package pytesstrain (Tesseract training helpers)

2020-02-10 Thread Shree Devi Kumar
Hello Wincent, Thanks for the new version of the package. There are no font-related errors now, and it is not slow either. Tested on Ubuntu. On Mon, Feb 10, 2020 at 12:28 AM Wincent Balin wrote: > Hello Shree, > > I just uploaded a new version of the package. About the fixes: > > 1. --fonts_dir: I added the

Re: [tesseract-ocr] Re: Announcement: Python package pytesstrain (Tesseract training helpers)

2020-02-09 Thread Shree Devi Kumar
Re: max threads, please see https://github.com/tesseract-ocr/tesseract/issues/263#issuecomment-455614504 for details. I will test the new scripts later and report back. On Mon, Feb 10, 2020 at 12:28 AM Wincent Balin wrote: > Hello Shree, > > I just uploaded a new version of the package. About the fixes: > >

Re: [tesseract-ocr] Re: Announcement: Python package pytesstrain (Tesseract training helpers)

2020-02-09 Thread Wincent Balin
Hello Shree, I just uploaded a new version of the package. About the fixes: 1. --fonts_dir: I added a default value for the fonts directory on each platform. 2. Number of threads: I also capped the maximum number of threads at the number of CPUs. Would you like to re-test it, please?
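The thread cap described in fix 2 can be illustrated with a minimal sketch. This is not pytesstrain's actual code; the helper name `cap_threads` and its `cpu_count` parameter are hypothetical, and only `os.cpu_count()` is a real standard-library call.

```python
import os


def cap_threads(requested, cpu_count=None):
    """Clamp a requested worker count to the range [1, number of CPUs].

    Hypothetical helper for illustration; pytesstrain may implement
    its cap differently.
    """
    if cpu_count is None:
        # os.cpu_count() may return None when the count is unknown.
        cpu_count = os.cpu_count() or 1
    return max(1, min(requested, cpu_count))
```

For example, `cap_threads(64, cpu_count=4)` yields 4, while a request below the CPU count is passed through unchanged.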

Re: [tesseract-ocr] Re: Announcement: Python package pytesstrain (Tesseract training helpers)

2020-02-04 Thread Shree Devi Kumar
> > By the way, I added a create_ground_truth utility, which creates .gt.txt > files as well as the associated .tif files for every specified font, to > the package. I think it could be useful for anyone who does not have a > ground truth collection yet. > > Thanks, I tried it with latest

Re: [tesseract-ocr] Re: Announcement: Python package pytesstrain (Tesseract training helpers)

2020-02-04 Thread Shree Devi Kumar
Thanks, Wincent. I will try out the tools you added. I found a Unicode version of the ISRI evaluation tools at https://github.com/eddieantonio/ocreval which also handles high-range Unicode code points. See

[tesseract-ocr] Re: Announcement: Python package pytesstrain (Tesseract training helpers)

2020-02-03 Thread Wincent Balin
Hi Shree, I am glad you already find the package useful :-). As to your question: I did not use the ocr-evaluation tools, only the language_metrics utility. So, regrettably, I cannot help you here. But maybe you could try the same utility too? By the way, I added a create_ground_truth

[tesseract-ocr] Re: Announcement: Python package pytesstrain (Tesseract training helpers)

2020-01-28 Thread shree
Hi Wincent, Thank you for sharing these tools. I find create-dictdata to be very useful. I wanted to know if you have modified any ocr-evaluation tools to handle the high Unicode range, such as for the Akkadian language. I was trying to test regarding Modi script (Range: U+11600..U+1165F; (96
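To make the "high Unicode range" concrete: the Modi block quoted above lies outside the Basic Multilingual Plane, so each character needs two UTF-16 code units, which is one common reason that tools assuming 16-bit characters mishandle such text. The sketch below is only an illustration with hypothetical helper names; it says nothing about how any particular evaluation tool behaves.

```python
# Modi block boundaries as quoted in the message above.
MODI_FIRST, MODI_LAST = 0x11600, 0x1165F


def is_modi(ch):
    """Return True if the single character ch lies in the Modi block."""
    return MODI_FIRST <= ord(ch) <= MODI_LAST


def supplementary_chars(text):
    """Return the characters of text that lie above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


def utf16_units(text):
    """Count the UTF-16 code units needed to encode text."""
    return len(text.encode("utf-16-le")) // 2
```

Running `supplementary_chars` over a ground-truth file's contents is a quick way to see whether it contains any characters that a 16-bit-only tool could miscount.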