Yes, it would certainly help if you sent us one or more example images. Adding to what has already been said, one thing is certain: Tesseract tries to treat everything as a known character, even schematics or line art. These shapes usually show up as garbage in the output. To avoid this, extract the image areas that contain only text, either manually or programmatically, and supply just those areas to Tesseract.
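As a rough illustration of the "extract text-only areas and feed them to Tesseract" idea, here is a minimal Python sketch. It assumes you already know the bounding boxes of the number regions (found by hand or by your own layout analysis), that Pillow is installed for cropping, and that the `tesseract` executable is on your PATH. The function names, the `regions` output directory and the box coordinates are all hypothetical. The `--psm` spelling and the `tessedit_char_whitelist` variable depend on your Tesseract version (older 3.x builds use `-psm`, and the whitelist may be ignored by some engine modes), so treat the flags as something to verify locally.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, digits_only=True, psm=7):
    """Return the argv list for one Tesseract run that prints text to stdout."""
    # psm 7 asks Tesseract to treat the image as a single text line,
    # which suits a tightly cropped number (assumption about your crops).
    cmd = ["tesseract", str(image_path), "stdout", "--psm", str(psm)]
    if digits_only:
        # Restrict recognition to digits; whether this is honoured
        # depends on the Tesseract version and engine mode.
        cmd += ["-c", "tessedit_char_whitelist=0123456789"]
    return cmd


def crop_regions(image_path, boxes, out_dir):
    """Save each (left, upper, right, lower) box as its own PNG file."""
    from PIL import Image  # Pillow, imported lazily; an assumed dependency

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with Image.open(image_path) as img:
        for i, box in enumerate(boxes):
            path = out_dir / f"region_{i}.png"
            img.crop(box).save(path)
            paths.append(path)
    return paths


def ocr_regions(image_path, boxes, out_dir="regions"):
    """Crop the given boxes and run Tesseract on each crop separately."""
    results = []
    for path in crop_regions(image_path, boxes, out_dir):
        proc = subprocess.run(
            build_tesseract_command(path),
            capture_output=True, text=True, check=True,
        )
        results.append(proc.stdout.strip())
    return results
```

Something like `ocr_regions("schematic.png", [(40, 10, 120, 40)])` would then return one string per cropped box, and the schematic itself never reaches Tesseract.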
Warm regards,
Dmitri Silaev
www.CustomOCR.com

On Wed, Oct 19, 2011 at 7:42 PM, Joao Henriques <joao.lhenriq...@googlemail.com> wrote:
> Hello everybody,
>
> I hope that someone can help me out here.
> There was nothing on the net regarding it, so'll just try it here :)
>
> I have a picture that needs to be OCR'ed.
> The picture contains a schematic and some numbers.
> I need to retrieve only these numbers from the picture.
>
> Tesseract keeps trying to analyse also the schematic, which results in
> a big mess :)
> Is there any way to get tesseract to only search the picture for
> numbers?
> Or was tesseract written only to analyse pictures with lines of text?
>
> Regards,
> Joao
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to tesseract-ocr@googlegroups.com
> To unsubscribe from this group, send email to
> tesseract-ocr+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en