Hi everyone.

I have successfully installed OCRopus onto a test machine and have been
testing the software using various images. The purpose of this testing is to
see the software at work and improve my understanding of the OCRopus project
and OCR in general.

I have recorded my results here:
http://wiki.fluidproject.org/display/fluid/OCRopus+0.3+Testing

For those who do not know, OCRopus is command-line OCR software. You pass it
an image as an argument, and the output is printed to the screen as HTML; you
can also redirect it to a file.
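
As a rough sketch of that workflow (not the official way to drive OCRopus),
the same "run the tool, capture the HTML, save it to a file" step could be
scripted in Python. The function name ocr_to_file and the `command` argument
are my own placeholders: `command` stands for whatever executable and options
start OCRopus on your machine, and I am deliberately not assuming its name.

```python
import subprocess
from pathlib import Path


def ocr_to_file(command, image_path, output_path):
    """Run `command <image_path>` and save what it prints to output_path.

    `command` is a list such as ["some-ocr-tool"]; it is a placeholder for
    the real OCR invocation, which this sketch does not assume.
    """
    result = subprocess.run(
        list(command) + [str(image_path)],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode the output as text
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    # Equivalent to redirecting the tool's screen output to a file.
    Path(output_path).write_text(result.stdout, encoding="utf-8")
    return result.stdout
```

Calling it once per test image makes it easy to spot which inputs come back
empty, since the returned string can be checked directly.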

The test images I used ranged from photographed text of varying contrast and
treatment (admittedly bad photos, but I was curious to see how OCRopus would
handle them) to scanned text with various layouts and font sizes. The
results are very interesting: some output was empty even though the image
was legible to the human eye, which underscores the importance of proper
exposure, white balance, and contrast in the input image.
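
To illustrate the contrast point, here is a minimal sketch of a linear
contrast stretch on grayscale pixel values. It is not something OCRopus does
or that I ran in these tests; the function name stretch_contrast and the
percentile defaults are assumptions chosen only for illustration.

```python
def stretch_contrast(pixels, low_pct=2.0, high_pct=98.0):
    """Linearly rescale grayscale values (0-255) so that the low_pct and
    high_pct percentiles of the input map to 0 and 255."""
    if not pixels:
        return []
    ordered = sorted(pixels)
    last = len(ordered) - 1
    lo = ordered[int(last * low_pct / 100)]
    hi = ordered[int(last * high_pct / 100)]
    if hi == lo:
        # Flat image: nothing to stretch, return the values unchanged.
        return list(pixels)
    scale = 255.0 / (hi - lo)
    # Clamp so values outside the chosen percentiles stay within 0-255.
    return [min(255, max(0, round((p - lo) * scale))) for p in pixels]
```

A washed-out photo whose values all sit in a narrow band would be spread
across the full range, which may or may not help the OCR step; that would
need to be tested against the same images.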

Having gone through this exercise, I wonder whether there are any other
adjustments or tweaks I can make from the command line to improve the
output. Or does a "good" text conversion depend mainly on a clean input
image?

- Jonathan.

PS. I will be cross-posting this email to the OCRopus mailing list shortly,
with some minor adjustments.

---
Jonathan Hung / [email protected]
Fluid Project - ATRC at University of Toronto
Tel: (416) 946-3002
_______________________________________________________
fluid-work mailing list - [email protected]
To unsubscribe, change settings or access archives,
see http://fluidproject.org/mailman/listinfo/fluid-work
