Hi everyone. I have successfully installed OCRopus onto a test machine and have been testing the software using various images. The purpose of this testing is to see the software at work and improve my understanding of the OCRopus project and OCR in general.
I have recorded my results here: http://wiki.fluidproject.org/display/fluid/OCRopus+0.3+Testing

For those who do not know, OCRopus is OCR software that runs from the command line. You pass it an image as an argument, and the output is written to the screen as HTML; you can also redirect it to a file.

The test images I used ranged from photographed text of varying contrast and treatment (admittedly bad photos, but I was curious to see how OCRopus handles them) to scanned text with various layouts and font sizes.

The results are very interesting. Some images produced empty output even though the text is legible to the human eye, which underscores the importance of proper exposure, white balance, and contrast in the input image.

Having gone through this exercise, I wonder whether there are any other adjustments or tweaks I can make from the command line to improve the output. Or does a "good" text conversion depend entirely on a clean input image?

- Jonathan.

PS. I will be cross-posting this email to the OCRopus mailing list shortly, with some minor adjustments.

---
Jonathan Hung / [email protected]
Fluid Project - ATRC at University of Toronto
Tel: (416) 946-3002
_______________________________________________________ fluid-work mailing list - [email protected] To unsubscribe, change settings or access archives, see http://fluidproject.org/mailman/listinfo/fluid-work
