Re: [tesseract-ocr] Incremental Training Tesseract 4.0+ for fraktur

2020-04-03 Thread hmaster
Hi Val,

How did you generate the 6k .gt.txt files from the tif files?

Thank you.

On Wednesday, 29 January 2020 14:02:40 UTC, Val LNB wrote:
>
> Thank you for the link!
>
> I found the following example:
>
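The thread preview does not show how the .gt.txt files were actually produced. As a minimal sketch only, assuming the transcriptions already exist as a mapping from image file name to text, the pairing that tesstrain expects (one `name.gt.txt` next to each `name.tif` line image) could be written like this. The function name `write_gt_files` and the example file names are hypothetical, not taken from the thread.

```python
from pathlib import Path


def write_gt_files(transcriptions, image_dir):
    """Write one <stem>.gt.txt next to each <stem>.tif named in `transcriptions`.

    `transcriptions` is a hypothetical dict mapping an image file name
    (e.g. "line_0001.tif") to the transcription of that single text line.
    Returns the list of ground-truth paths that were written.
    """
    image_dir = Path(image_dir)
    written = []
    for image_name, text in sorted(transcriptions.items()):
        image_path = image_dir / image_name
        if not image_path.exists():
            continue  # skip transcriptions that have no matching line image
        gt_path = image_path.with_name(image_path.stem + ".gt.txt")
        # Collapse all whitespace runs so each file holds exactly one line.
        gt_path.write_text(" ".join(text.split()) + "\n", encoding="utf-8")
        written.append(gt_path)
    return written
```

Calling `write_gt_files({"line_0001.tif": "some text"}, "ground-truth")` would then create `ground-truth/line_0001.gt.txt`, provided the image exists in that directory.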

Re: [tesseract-ocr] Incremental Training Tesseract 4.0+ for fraktur

2020-01-29 Thread Shree Devi Kumar
tesseract 4.0.0-beta.1

This is quite old. I suggest you use the latest build.

I am not sure if @stweil is actively watching this forum. You can post a question in the tesstrain repo.

On Wed, Jan 29, 2020 at 7:32 PM Val LNB wrote:
> Thank you for the link!
>
> I found the following example:
>

Re: [tesseract-ocr] Incremental Training Tesseract 4.0+ for fraktur

2020-01-29 Thread Val LNB
Thank you for the link!

I found the following example:
https://github.com/tesseract-ocr/tesstrain/wiki/GT4HistOCR#finetuning-based-on-scriptfraktur

Here are the instructions that I have figured out so far for fine-tuning an existing model:

On Ubuntu 18.04, I first double-checked for the right packages

Re: [tesseract-ocr] Incremental Training Tesseract 4.0+ for fraktur

2020-01-28 Thread Shree Devi Kumar
Please see https://github.com/tesseract-ocr/tesstrain/wiki

There are already newly trained models by @stweil for Fraktur.

On Tue, Jan 28, 2020, 22:46 Val LNB wrote:
> *How to perform incremental training on Tesseract 4.0+?*
>
>
> I want to improve the existing fraktur (frk) model with some

[tesseract-ocr] Incremental Training Tesseract 4.0+ for fraktur

2020-01-28 Thread Val LNB
*How to perform incremental training on Tesseract 4.0+?*

I want to improve the existing Fraktur (frk) model with some 6000 hand-curated lines from our library. The ground truth for these lines has 10 new Unicode characters that are not present in the German Fraktur model. How can I continue training from
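One way to confirm which characters in the ground truth are new to the model is to compare character sets. The following is a rough sketch, assuming the ground truth is stored as UTF-8 `*.gt.txt` files in one directory and that the characters the existing model already knows are supplied by the caller. The names `ground_truth_charset`, `missing_characters` and `model_chars` are hypothetical, and how the model's character list is obtained is left open here.

```python
from pathlib import Path


def ground_truth_charset(gt_dir):
    """Collect every distinct character used in the *.gt.txt files of gt_dir."""
    chars = set()
    for gt_path in Path(gt_dir).glob("*.gt.txt"):
        chars.update(gt_path.read_text(encoding="utf-8"))
    # Line terminators are file structure, not characters to be recognized.
    chars.discard("\n")
    chars.discard("\r")
    return chars


def missing_characters(gt_dir, model_chars):
    """Return the sorted ground-truth characters that are absent from model_chars.

    `model_chars` is an assumption: any iterable of the characters that the
    existing model already covers.
    """
    return sorted(ground_truth_charset(gt_dir) - set(model_chars))
```

If the returned list is non-empty, the ground truth contains characters that the starting model cannot output, which is the situation described in the question.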