Tesseract is an open-source OCR program. It can already
produce searchable PDFs and will soon support streaming.
It would be fun to support something like this:

   scanimage --batch | tesseract - - pdf > searchable.pdf

To make this work nicely, scanimage would need to
print the name of each file to stdout once that file has been
completely written.
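
To illustrate the consuming side of such a pipe, here is a minimal
Python sketch. It is not Tesseract's or scanimage's actual code; the
function name finished_pages and the file names out1.pnm / out2.pnm
are made up for the example. It only assumes the protocol described
above: one file name per line, printed after the file is complete.

```python
import io
from typing import Iterator, TextIO


def finished_pages(stream: TextIO) -> Iterator[str]:
    """Yield each file name as soon as the producer has printed it.

    Assumes one file name per line; blank lines are ignored.
    """
    for line in stream:
        name = line.rstrip("\r\n")
        if name:
            yield name


# Stand-in for the scanner's stdout; a real consumer would pass sys.stdin.
fake_scanner_output = io.StringIO("out1.pnm\nout2.pnm\n")

for page in finished_pages(fake_scanner_output):
    # A real consumer would run OCR on the page here.
    print("page ready:", page)
```

For pages to arrive one at a time rather than in a burst at the end,
the producer would also have to flush stdout after each name, since
output to a pipe is normally block-buffered.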

Thoughts?

Jeff
-- 
sane-devel mailing list: sane-devel@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel