Mark, did you find a solution to the problem described in the line below (extracted from your original message)? If so, please let me know. Thanks.
*What I have picked up is that the OCR struggles with certain problem characters: O vs 0, 5 vs S, 2 vs Z, B vs 8*

On Thursday, November 20, 2014 7:53:43 AM UTC-5, Mark Beylis wrote:
>
> Hello
>
> I am making use of Tesseract OCR to perform number plate recognition on
> vehicles.
>
> I am making use of jTessBoxEditor v1.1 to check my box and tif files.
>
> At the moment each iteration of my training consists of using about
> 250 - 300 number plates.
>
> I have read in many places that one should train fonts separately. This is
> difficult in my case, as my source images contain number plates with
> varying fonts, unless I manually look through each one of the 100 initial
> images I use per training iteration to separate them into different
> groups. Would this really be necessary?
>
> I have been doing training for over a month now and have probably trained
> on over 1000 images and 3000 number plates, but I cannot seem to get an
> accuracy above 86%.
>
> I was wondering if you have some suggestions, as ideally I would like to
> see in excess of 90% accuracy.
>
> What I have picked up is that the OCR struggles with certain problem
> characters: O vs 0, 5 vs S, 2 vs Z, B vs 8
>
> Is there a specific way of training that I should use to improve correct
> reads of these characters? During my editing of the tif/box files in
> jTessBoxEditor I am torn between discarding the bad quality read
> characters and keeping only the good quality ones, versus correcting each
> and every character to be what it should be regardless of the quality of
> the character in the tif file. Which is the better approach, and why?
>
> Any other suggestions on how to improve my training using jTessBoxEditor
> are greatly appreciated.
>
> Thanks
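
One workaround that may help with the look-alike characters listed above, independent of how the training is done, is to correct the OCR output afterwards using the known plate layout. The following is only a minimal sketch in Python, assuming a hypothetical layout of three letters followed by three digits (the thread does not state the actual plate format). `PLATE_PATTERN` and `fix_confusables` are illustrative names, not part of Tesseract or jTessBoxEditor.

```python
# Hypothetical plate layout: "L" = letter expected, "D" = digit expected.
# This pattern is an assumption for illustration; substitute the real format.
PLATE_PATTERN = "LLLDDD"

# Look-alike pairs named in the message above, mapped in each direction.
TO_DIGIT = {"O": "0", "S": "5", "Z": "2", "B": "8"}
TO_LETTER = {digit: letter for letter, digit in TO_DIGIT.items()}


def fix_confusables(raw, pattern=PLATE_PATTERN):
    """Swap look-alike characters so each position matches its expected class."""
    text = raw.strip().upper().replace(" ", "")
    if len(text) != len(pattern):
        # Length mismatch: positions cannot be trusted, so leave the read as-is.
        return text
    fixed = []
    for ch, kind in zip(text, pattern):
        if kind == "D" and not ch.isdigit():
            ch = TO_DIGIT.get(ch, ch)
        elif kind == "L" and not ch.isalpha():
            ch = TO_LETTER.get(ch, ch)
        fixed.append(ch)
    return "".join(fixed)


if __name__ == "__main__":
    print(fix_confusables("A8C1ZS"))
```

This only works where the plate format fixes which positions hold letters and which hold digits. For plates with a free mix of both, it cannot tell O from 0, and the question of how to train for those characters remains open.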