Image pre-processing for good OCR results

Hi,

My project at http://RecordAGrave.com records grave headstones and posts 
the text and images online so that people can research their family 
history.  I would appreciate some advice on how to pre-process these 
headstone images to get the best results from Tesseract OCR.  I have 
thousands of 1-2 MB JPEG images to process.


Example images:
http://freepages.genealogy.rootsweb.ancestry.com/~janderse/cemeteries/Star%20of%20David%20Memorial%20Gardens/Garden%20of%20Haifa%20-%20Raw/IMG_28215.jpg
http://freepages.genealogy.rootsweb.ancestry.com/~janderse/cemeteries/Star%20of%20David%20Memorial%20Gardens/Garden%20of%20Haifa%20-%20Raw/IMG_28216.jpg
http://freepages.genealogy.rootsweb.ancestry.com/~janderse/cemeteries/Star%20of%20David%20Memorial%20Gardens/Garden%20of%20Haifa%20-%20Raw/IMG_28217.jpg

I am a software developer, so I can script pre-processing steps to prepare 
the input for Tesseract.

Any advice on improving OCR accuracy through pre-processing steps?

Thanks so much,

-Jon

