Came across this in an article on lwn.net:

http://jocr.sourceforge.net

No idea whether it would help, but take a look.
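A rough sketch of how that tool might slot into a script, under several assumptions: the `gocr` binary from that project and ImageMagick's `convert` are both on the PATH, the TIFF is a single page, and the helper names `ocr_commands` and `ocr_tiff` are made up for illustration. It only produces a plain-text file next to the image (useful for full-text indexing); it does not by itself give you a searchable PDF.

```python
import subprocess
from pathlib import Path


def ocr_commands(tiff_path):
    """Build the two commands to run: TIFF -> PNM, then PNM -> plain text.

    Hypothetical helper. The conversion step is there because gocr
    reads PNM input natively.
    """
    tiff = Path(tiff_path)
    pnm = tiff.with_suffix(".pnm")
    txt = tiff.with_suffix(".txt")
    return [
        ["convert", str(tiff), str(pnm)],          # ImageMagick: TIFF to PNM
        ["gocr", "-i", str(pnm), "-o", str(txt)],  # gocr: PNM to text file
    ]


def ocr_tiff(tiff_path):
    """Run both steps and return the path of the resulting text file."""
    for cmd in ocr_commands(tiff_path):
        # check=True raises CalledProcessError if either tool fails
        subprocess.run(cmd, check=True)
    return Path(tiff_path).with_suffix(".txt")
```

Calling `ocr_tiff("scan.tif")` would leave `scan.pnm` and `scan.txt` beside the original; how good the recognized text is depends heavily on scan quality.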
BTW, the article was about how a Spamassassin plugin is using this to scan for spammery in images. Quite interesting.

Owen

On Tue, Aug 22, 2006 at 12:05:34PM -0400, James Tuttle wrote:
> Hello:
>
> I'm working to script some processes at work and have come across a
> problem creating pdfs from tiff images. I can easily do it with
> 'convert' from imagemagick, but this yields an image pdf rather than a
> text pdf which means one can't search it, select text, or full-text
> index the pdf. I was wondering if anyone has any advice about how to
> integrate ocr into the process. Alternately, I've been given a copy of
> Acrobat 7 Pro, but it doesn't seem to have a scriptable API.
>
> Any ideas?
>
> Thanks,
> Jim
--
TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
TriLUG Organizational FAQ : http://trilug.org/faq/
TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
