Tesseract is an open source OCR program. It can already produce searchable PDF and will soon support streaming. It would be fun to support something like this:

    scanimage --batch | tesseract - - pdf > searchable.pdf

To make this work nicely, scanimage would need to print the name of each file to stdout after it is written. Thoughts?

Jeff
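
To illustrate the handshake described above, here is a rough sketch in plain shell. It assumes a producer that prints each filename to stdout only after the file is completely written, and a consumer that reads those names one per line. The functions `fake_scan` and `consume` are hypothetical stand-ins invented for this example; they are not scanimage or tesseract, and they say nothing about how either program would actually implement this.

```shell
#!/bin/sh
# Hypothetical stand-in for a scanner in batch mode: it writes a page,
# and only then prints that page's filename to stdout.
fake_scan() {
    dir=$1
    for n in 1 2 3; do
        printf 'page %s\n' "$n" > "$dir/out$n.pnm"   # write the page first
        printf '%s\n' "$dir/out$n.pnm"               # then announce it
    done
}

# Hypothetical stand-in for the consumer: it picks up each file as soon
# as its name arrives on stdin, so it never sees a half-written page.
consume() {
    while IFS= read -r name; do
        if [ -s "$name" ]; then
            printf 'ready: %s\n' "${name##*/}"
        fi
    done
}

tmp=$(mktemp -d)
fake_scan "$tmp" | consume
rm -rf "$tmp"
```

Because the name is printed only after the write finishes, the reader can start on page 1 while page 2 is still being scanned, which is the point of the pipeline.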
--
sane-devel mailing list: sane-devel@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel