rbt said the following on 2/22/2005 8:53 AM:
Not really a Python question... but here goes: Is there a way to read the content of a PDF file and decode it with Python? I'd like to read PDFs, decode them, and then search the data for certain strings.

Thanks, rbt

Hi,

Try pdftotext, which is part of the Xpdf project. pdftotext extracts the text from a PDF file and writes it to an output text file of your choice. I have used it in the past (not with Python) to do what you are attempting. It is a small program, and you can invoke it from Python and then search its output for the string or pattern you want.

You can download it for your OS from:
http://www.foolabs.com/xpdf/download.html
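
For what it's worth, here is a rough sketch of how calling pdftotext from Python and searching its output might look. It assumes pdftotext is installed and on your PATH, that passing "-" as the output file sends the text to standard output, and that the -enc UTF-8 option is available in your build. The function names pdf_to_text and find_matches are made up for this example and are not part of Xpdf or the standard library.

```python
import re
import subprocess


def pdf_to_text(pdf_path, pdftotext_cmd="pdftotext"):
    """Return the text of a PDF by running the external pdftotext program.

    Assumption: pdftotext is installed and reachable as `pdftotext_cmd`.
    Passing "-" as the output file asks it to write to standard output.
    """
    result = subprocess.run(
        [pdftotext_cmd, "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext exits with an error
    )
    # Decode leniently in case the output contains unexpected bytes.
    return result.stdout.decode("utf-8", errors="replace")


def find_matches(text, pattern):
    """Return (line_number, line) pairs for every line matching the regex."""
    regex = re.compile(pattern)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]
```

You would then call something like find_matches(pdf_to_text("report.pdf"), r"invoice"), where report.pdf and the pattern are placeholders. Keeping the search separate from the extraction means you can test the search on plain strings without needing a PDF at hand.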

Thanks,
-Kartic
--
http://mail.python.org/mailman/listinfo/python-list