Is there a howto for optimising text extraction from non-document images?

2013-04-29 Thread Jonathan Chetwynd
I have a number of webcam images of road signage. tesseract-ocr output is highly variable; how can I optimise it? For example, http://peepo.com/pics/ocr/road_signs.jpg outputs "West End Barbican Exhibition -9 Halls", and http://peepo.com/pics/ocr/when_red.png ('when red light shows stop here') outputs

Where is a howto for optimising text extraction from non-document images?

2013-04-28 Thread Jonathan Chetwynd
Is there a document explaining how to tweak tesseract to get the most from non-document images? I have a collection of webcam images containing small amounts of text on signage. I find the results extremely variable, with no clear or easily understood cause. http://peepo.com/