tesseract-ocr, a command-line OCR package, has been added to the Cygwin
distribution.
The Tesseract OCR engine was originally developed at HP between 1985 and
1995. It was open-sourced by HP and UNLV in 2005, and Google has led
further development.
The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV
Accuracy test. Little work was done on it between 1995 and 2006, but
it is probably one of the most accurate open source OCR engines
available. It reads a binary, grey or color image and outputs text.
Homepage: http://code.google.com/p/tesseract-ocr/
Notes:
* Built with libtiff, but it nevertheless accepts only certain
TIFF image formats. Converting the image first with convert -depth
from the ImageMagick package works for me. I use:
convert <any> -depth 8 <any.tif>
* I haven't tried http://code.google.com/p/ocropus/
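
The conversion note above can be sketched as a small shell helper. This
is only an illustration, not part of the package: ocr_image is a
hypothetical name, the "-8bit.tif" suffix is an arbitrary choice, and it
assumes convert (ImageMagick) and tesseract are both on the PATH and that
tesseract is called as "tesseract <image> <outputbase> -l <lang>".

```shell
#!/bin/sh
# Hypothetical helper: convert an image to an 8-bit TIFF, then OCR it.
# Usage: ocr_image <image> [lang]    (lang defaults to eng)
ocr_image() {
    src=$1
    lang=${2:-eng}
    base=${src%.*}              # strip the extension: scan.png -> scan
    tif="$base-8bit.tif"        # assumed name for the intermediate file

    # Step 1: the conversion from the note above.
    convert "$src" -depth 8 "$tif" || return 1

    # Step 2: tesseract appends .txt to the output base itself.
    tesseract "$tif" "$base" -l "$lang"
}
```

With this, "ocr_image scan.png deu" should leave the recognized text in
scan.txt, provided the tesseract-ocr-deu package is installed.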
Packages:
tesseract-ocr
tesseract-ocr-devel
And the following languages, as in Debian:
tesseract-ocr-eng (default)
tesseract-ocr-deu
tesseract-ocr-deu-f (German Fraktur)
tesseract-ocr-fra
tesseract-ocr-ita
tesseract-ocr-nld
tesseract-ocr-por
tesseract-ocr-spa
tesseract-ocr-vie
If you have questions or comments, please send them to
the Cygwin mailing list at: cygwin@cygwin.com .
I will answer only there; I don't answer private mail.
*** CYGWIN-ANNOUNCE UNSUBSCRIBE INFO ***
If you want to unsubscribe from the cygwin-announce
mailing list, look at the "List-Unsubscribe: " tag in
the email header of this message. Send email to the
address specified there. It will be in the format:
cygwin-announce-unsubscribe-you=yourdomain....@cygwin.com
If you need more information on unsubscribing, start
reading here:
http://sources.redhat.com/lists.html#unsubscribe-simple
Please read *all* of the information on unsubscribing
that is available starting at this URL.