*A lot of times I have seen fairly good number plate images being OCRed inaccurately. This could possibly be due to the word recognition stage. Has anyone found a way to disable the dictionary / word recognition?*

Saurabh,

Have you been able to accomplish this? Could you kindly share your insights? I have a similar need.

Thanks a lot in advance.

rgds,
JV Iyer

On Wednesday, February 16, 2011 10:48:56 PM UTC-6, Saurabh Gandhi wrote:
>
> Hello everyone,
>
> I am currently using tesseract 3.x for license plate recognition.
> I have an algorithm which does a good job of pre-processing the input
> image to localize the plate.
> However, when I use the Tesseract OCR engine to classify the plate number,
> the recognition is not that accurate. I have gone through the tesseract
> whitepapers as well as some of the threads discussing LPR using
> tesseract.
>
> From all this, I have identified the following ways of improving the
> results:
>
>    1. Customise the tesseract engine to recognize only the characters
>    A-Z, 0-9, . (dot) and (space) by setting the character white-list. My
>    understanding is that the white-list is the list of characters that are
>    going to be recognized. I am curious what the blacklist is meant
>    to do.
>    2. A lot of times I have seen fairly good number plate images being
>    OCRed inaccurately. This could possibly be due to the word recognition
>    stage. Has anyone found a way to disable the dictionary / word
>    recognition?
>    3. Then there are some page segmentation modes
>    (PSM_AUTO, PSM_SINGLE_BLOCK, PSM_CHAR etc). Does PSM_CHAR imply that it
>    will consider the input image as a single character and run the
>    algorithm accordingly without attempting word recognition?
>    4. Another important configuration macro that I have seen within the
>    code was AVS_FASTEST = 0, AVS_MOST_ACCURATE = 100. However, I could not
>    find it being used anywhere in the code. Does this have any impact
>    on the *character recognition* accuracy?
>    5. Finally, I also plan to use the confidence level data. Are there
>    any indicators of confidence for characters as well? There is word
>    confidence data which can be found in TessBaseAPI::AllWordConfidences().
>
> Awaiting your valuable insights.
> Thank you.
>
> Regards,
> Saurabh Gandhi
>
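For anyone wanting to try items 1 and 2 from the quoted list, here is a minimal sketch in Python that writes out the text of a Tesseract config file. It assumes the config variables `tessedit_char_whitelist`, `load_system_dawg` and `load_freq_dawg` are available in the 3.x build being used; the helper name `build_lpr_config` is hypothetical, and whether turning the dictionaries off actually improves plate accuracy has not been verified here.

```python
import string


def build_lpr_config(extra_chars="."):
    """Return the text of a Tesseract config file for plate-like input.

    Hypothetical helper: it only assembles "name value" lines and does
    not call Tesseract itself.
    """
    # Item 1: restrict recognition to A-Z, 0-9 and the dot.
    # Space is left out on the assumption that it is produced by word
    # spacing rather than classified as a character.
    whitelist = string.ascii_uppercase + string.digits + extra_chars
    lines = [
        "tessedit_char_whitelist " + whitelist,
        # Item 2: do not load the main and frequent-word dictionaries.
        "load_system_dawg F",
        "load_freq_dawg F",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_lpr_config(), end="")
```

The output can be saved to a config file and named on the tesseract command line after the output base. The two dictionary variables are, as far as I know, read at initialization, so setting them later through SetVariable() may have no effect.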