You've tried unicharambigs, right? (See the bottom of this page: 
https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3)
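
For what it's worth, here is a rough sketch of generating a unicharambigs file for the confusable pairs listed in the question below. It assumes the tab-separated "v1" layout described on that wiki page (character count, source characters, character count, target characters, type flag, where 0 marks an optional substitution and 1 a mandatory one). The helper names are hypothetical, and listing each pair in both directions is my assumption rather than a requirement. Optional ambiguities are only considered alongside the language's word lists, so whether this helps for plates depends on your setup.

```python
# Confusable pairs taken from the question quoted below.
CONFUSABLE_PAIRS = [("O", "0"), ("5", "S"), ("2", "Z"), ("B", "8")]


def ambig_line(source, target, mandatory=False):
    """Format one line of a v1 unicharambigs file (tab-separated fields)."""
    src = list(source)
    tgt = list(target)
    return "\t".join([
        str(len(src)), " ".join(src),   # count and characters to match
        str(len(tgt)), " ".join(tgt),   # count and characters to substitute
        "1" if mandatory else "0",      # assumed: 1 = mandatory, 0 = optional
    ])


def build_unicharambigs(pairs):
    """Return the file contents, listing each pair in both directions."""
    lines = ["v1"]
    for a, b in pairs:
        lines.append(ambig_line(a, b))
        lines.append(ambig_line(b, a))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the contents; save them as <lang>.unicharambigs before
    # combining the traineddata.
    print(build_unicharambigs(CONFUSABLE_PAIRS), end="")
```

The output would be saved next to the other training files under your language code and combined into the traineddata as usual.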

On Thursday, 20 November 2014 12:53:43 UTC, Mark Beylis wrote:
>
> Hello
>
> I am making use of Tesseract OCR to perform number plate recognition on 
> vehicles
>
> I am making use of jTessBoxEditor v1.1 to check my box and tif files
>
> At the moment each iteration of my training consists of using about 250 - 
> 300 number plates
>
> I have read in many places that one should train fonts separately. This is 
> difficult in my case, as my source images contain number plates in varying 
> fonts, so I would have to look manually through each of the 100 initial 
> images I use per training iteration and sort them into groups. Would this 
> really be necessary?
>
> I have been training for over a month now, on probably more than 1000 
> images and 3000 number plates, and cannot seem to get accuracy above 86%.
>
> I was wondering if you have some suggestions as ideally I would like to 
> see in excess of 90% accuracy
>
> What I have picked up is that the OCR struggles with certain problem 
> characters: O vs 0, 5 vs S, 2 vs Z, B vs 8.
>
> Is there a specific way of training that I should use to improve correct 
> reads of these characters? While editing the tif/box files in 
> jTessBoxEditor I am torn between two approaches: discarding the 
> poor-quality characters and keeping only the good ones, or correcting 
> every character to what it should be regardless of its quality in the tif 
> file. Which is the better approach, and why?
>
> Any other suggestions on how to improve my training using jTessBoxEditor 
> would be greatly appreciated.
>
> Thanks
>
