Needless to say, this is a difficult image. For a start, the picture is taken at a skewed angle and the plastic is squished on the right. There is god knows how much other text noise in and around the image, and then there's the natural scene noise: edges, shading, lines etc. Tesseract does not like this kind of image.
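One thing that may cut down on the stray characters: the values on these labels are just digits, a decimal point and a small g, so you can ask Tesseract to consider only those characters and to treat the input as a single line of text. Below is a rough sketch of how that call could be built from Python. It assumes the `tesseract` binary is on your PATH and that the image has already been cropped to one value; the file name `fat_cell.png` and the helper names are made up for illustration. `tessedit_char_whitelist` and page segmentation mode 7 are real Tesseract options, but the flag spelling depends on the version (`-psm` in 3.x, `--psm` in later releases), and some LSTM-based builds reportedly ignore the whitelist, so treat this as something to try rather than a fix.

```python
import shlex
import subprocess

# Assumption: only gram values are wanted, so digits, '.' and 'g' are enough.
ALLOWED_CHARS = "0123456789.g"


def build_tesseract_command(image_path, modern_cli=True):
    """Build a tesseract command line for one cropped, single-line value.

    modern_cli=True uses the '--psm' spelling, False uses the older '-psm'.
    """
    psm_flag = "--psm" if modern_cli else "-psm"
    return [
        "tesseract", image_path, "stdout",  # write the OCR result to stdout
        psm_flag, "7",                      # 7 = treat the image as a single text line
        "-c", "tessedit_char_whitelist=" + ALLOWED_CHARS,
    ]


def ocr_cropped_value(image_path, modern_cli=True):
    """Run tesseract on the cropped image and return the stripped text."""
    cmd = build_tesseract_command(image_path, modern_cli)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Only prints the command; call ocr_cropped_value() to actually run it.
    # 'fat_cell.png' is a hypothetical file name.
    print(" ".join(shlex.quote(arg) for arg in build_tesseract_command("fat_cell.png")))
```

A whitelist will not stop a g being read as a 9, since both characters are allowed, but it should at least stop letters and symbols from the surrounding text leaking into the result.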
You have to whittle your input to Tesseract down to as clean an image as possible. I have tried cropping your image right back to the white of the areas you suggest and got at best:

0:669 S$29 i 1535 10.0991

This is probably better than you got, but not accurate enough. I think you need to think hard about how best to extract the zone you are after first. This design is fairly common on UK food, so perhaps you can somehow recognise this part of the input image and crop it out, then do a further crop using line detection to get the individual pieces out.

Having said that, even a well-spaced, cropped first element:

[image: Inline images 1]

is returning 0.69, with the 'g' coming out as a 9. You might fix this with training on this font, however, as the height of the lower-case g is unusually high.

Cheers

On 28 July 2016 at 18:16, Douglas Millward <djm...@gmail.com> wrote:

> Hi
> I'm new to this forum and I've searched for a similar topic - excuse me if
> I've missed anything relevant.
> I want to OCR the 'traffic light' nutrition information on food packaging
> - it's basically numbers with a small g - an example is attached. I have
> processed it through tesseract and I just get gobbledegook. Do I need to
> train it to read this format? And if so (here's hoping) has anyone done
> anything similar?
> Any pointers welcome
>
> kind regards
>
> Doug