Hi, I'm not an expert with Tesseract, but I will give my point of view to see if it helps:
To recognize single characters while skipping layout analysis in Tesseract 3.0, see: http://markmail.org/message/km6fzufboilckjcf

Before that, you should split the image so that you have one image per character. You can do this in several ways, depending on the particulars of your app's images. For example, find the contours or blobs of the characters, compute their bounding boxes, and crop the image inside each box. You can do that with OpenCV, for example.

Alternatively, you could try scaling up the images to see what happens (I'm not sure about this approach).

Cheers,
Andres

2011/5/24 Joyse1 <[email protected]>
> Hi,
> I'm training Tess with "0123456789" text in MicrosoftSansSerif font, size
> 8 (a small one). I need to recognize small, short text only. I have noticed
> that Tess has some built-in page layout analysis mechanism which treats
> small, short text as noise and produces an empty page or just part of the
> text. How can I bypass it? What is the solution for this?
>
> Best
> Jakub
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
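
P.S. In case it helps, here is a rough sketch of the blob / bounding-box idea from my reply. It is not OpenCV code: to keep it self-contained I assume the image is already binarized into a list of rows of 0/1 values (1 = ink), and the names `char_boxes` and `crop` are just made up for this example. With OpenCV you would more likely use `cv2.findContours` and `cv2.boundingRect` for the same purpose. Each crop could then be given to Tesseract in a single-character mode (as far as I know, `-psm 10` on the 3.x command line).

```python
from collections import deque


def char_boxes(img):
    """Return inclusive bounding boxes (left, top, right, bottom) of every
    8-connected blob of foreground pixels (value 1), sorted left to right."""
    height = len(img)
    width = len(img[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over one blob, tracking its extent.
            left, top, right, bottom = x, y, x, y
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                # Visit the 8 neighbours (diagonals included).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


def crop(img, box):
    """Cut the sub-image inside an inclusive bounding box."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]


if __name__ == "__main__":
    # Tiny made-up image: a vertical bar and a "C"-like shape.
    image = [
        [0, 1, 0, 0, 1, 1],
        [0, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 1, 1],
    ]
    for box in char_boxes(image):
        print(box, crop(image, box))
```

Note that this treats every connected blob as one character, so touching digits would come out as a single box and broken strokes as several; for a small font like size 8 you may need to clean up or scale the image first.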

