[tesseract-ocr] Re: Install Tesseract 4 on CentOS and Red Hat [SOLVED!]

2018-09-06 Thread Periasamy Kanagavel
I am new to CentOS. I am trying the steps mentioned here. Up to the step "PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ...", there were no issues. While running the command "LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j", I was getting "libtool: Version mismatch error. This is l

[tesseract-ocr] Re: Fine tuning existing model

2018-09-06 Thread Raniem
Thanks for the detailed answer. I am giving it a shot and hoping to get some better results :) Thanks for all your help and support. Best regards. On Friday, June 29, 2018 at 1:01:08 PM UTC+1, Lorenzo Blz wrote: > Hi, > I'm trying to do fine tuning of an existing model using line i

Re: [tesseract-ocr] Re: Fine tuning existing model

2018-09-06 Thread Lorenzo Bolzani
Hi Raniem, I did 5 fine tunings for different fonts and text content with roughly these numbers:
iterations: samples (training data)
750: 208 numbers (4 upper case + 5 digits each)
1000: 400 MRZ codes (22 uppercase chars each)
1800: 1000 numbers (10 digits each)
2250
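To compare the runs listed above, one rough measure is iterations per training sample. The sketch below is only an illustration: it uses the three complete iteration/sample pairs from the message (the fourth entry is cut off, so it is left out), and the name `iterations_per_sample` is hypothetical, not part of any Tesseract tool.

```python
# Iteration and sample counts quoted in the message above.
# The fourth entry is truncated in the archive, so it is omitted here.
runs = [
    (750, 208),    # numbers (4 upper case + 5 digits each)
    (1000, 400),   # MRZ codes (22 uppercase chars each)
    (1800, 1000),  # numbers (10 digits each)
]


def iterations_per_sample(iterations, samples):
    """Hypothetical helper: how many training iterations were run per sample."""
    return iterations / samples


# Round to two decimals for readability.
ratios = [round(iterations_per_sample(i, s), 2) for i, s in runs]

for (i, s), r in zip(runs, ratios):
    print(f"{i} iterations / {s} samples = {r} iterations per sample")
```

In these three runs the ratio drops as the training set grows, which may be a useful starting point when picking an iteration count for a new data set, though that is an observation on a very small set of figures, not a rule.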

Re: [tesseract-ocr] Making custom traineddata

2018-09-06 Thread Shree Devi Kumar
> When it's combining language model I've spotted that it's making some dawg files. Yes, it takes the files from langdata repo specified in the training command. You could change langdata/pol/pol.wordlist to have only the LAST NAMES and GIVEN NAMES, pol.punc to have only < and change number forma

[tesseract-ocr] Re: Fine tuning existing model

2018-09-06 Thread Raniem
Hi @Lorenzo Blz, how many data lines and iterations did you use in your fine tuning? In your last reply you mentioned that you replaced merge_unicharsets $(TESSDATA)/$(CONTINUE_FROM).lstm-unicharset $(TRAIN)/my.unicharset "$@" with: cp "$(TRAIN)/my.unicharset" "data/unicharset" which is

Re: [tesseract-ocr] Making custom traineddata

2018-09-06 Thread kaminski . robert . it
Thank you for your reply, Shreeshrii! Indeed, the fine-tune method is a much better solution for my problem. Thanks to the logs and data provided in your repo, I realized that I don't need to generate every single MRZ code separately (I'm sure it was mentioned somewhere). In fact the process of making