On Tuesday 21 April 2009 10:36:59 am Robert Berman wrote:
> Bob,
>
> Thank you for the quick reply. I am acquainted with that method, and
> that will certainly work to do some really serious testing; but, the
> data collection is an ongoing process and the users are requesting that
> every month the latest entries (8) are brought into the system. What is
> rather irksome is that the output from the system cannot be changed from
> PDF to text; so obviously I am going to have to resolve the situation at
> my end.
>
> I am envisioning a simple program that once started reads the data file,
> converts the data into text, and then sends the data to the database.
> The program doesn't care if there are 8 test results or 80,000 test
> results. That is why i am looking for a python module.
>
> Thanks again,
>
> Robert Berman
>
> bob gailer wrote:
> > Robert Berman wrote:
> >> Hi,
> >>
> >> I must convert a history file in PDF format that goes from May of
> >> 1988 to current date. Readings are taken twice weekly and consist of
> >> the date taken mm/dd/yy and the results appearing as a 10 character
> >> numeric + special characters sequence. This is obviously an easy
> >> setup for a very small database application with the date as the
> >> key, the result string as the data.
> >>
> >> My problem is converting the PDF file into a text file which I can
> >> then read and process. I do not see any free python libraries having
> >> this capacity. I did see a PDFPILOT program for Windows but this
> >> application is being developed on Linux and should also run on
> >> Windows; so I do not want to incorporate a Windows only application.
> >>
> >> I do not think i am breaking any new frontiers with this application.
> >> Have any of you worked with such a library, or do you know of one or
> >> two I can download and work with? Hopefully, they have reasonable
> >> documentation.
> >
> > If this is a one-time conversion just use the save as text feature of
> > adobe reader.
> >
> >> My development environment is:
> >>
> >> Python
> >> Linux
> >> Ubuntu version 8.10
> >>
> >>
> >> Thanks for any help you might be able to offer.
> >>
> >>
> >> Robert Berman
On Linux, pdftotext is available, and you might also want to check out
Ghostscript, which runs on both Windows and Linux.

--
John Fabiani
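
For what it's worth, here is a rough sketch of how the pdftotext route could be wired into a Python script. It is not tested against the real history file. It assumes pdftotext is installed and on the PATH, and that each reading ends up on its own line as a mm/dd/yy date followed by a 10-character result with no spaces in it. The names pdf_to_text, parse_readings and READING are made up for this example, and the pattern will probably need adjusting once you see the actual extracted text.

```python
import re
import subprocess

# Assumed line layout (a guess, not confirmed by the real file):
# a mm/dd/yy date, whitespace, then a 10-character result string
# containing no spaces.
READING = re.compile(r"^\s*(\d{2}/\d{2}/\d{2})\s+(\S{10})\s*$")


def pdf_to_text(pdf_path):
    """Run the external pdftotext tool and return the extracted text.

    "-layout" asks pdftotext to keep the page layout, and "-" as the
    output file name sends the text to stdout instead of a file.
    """
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def parse_readings(text):
    """Return (date, result) tuples for every line matching READING."""
    readings = []
    for line in text.splitlines():
        match = READING.match(line)
        if match:
            readings.append((match.group(1), match.group(2)))
    return readings
```

Something like parse_readings(pdf_to_text("history.pdf")) would then give a list of (date, result) pairs to load into the database, whether the file holds 8 readings or 80,000. The file name here is only a placeholder. Lines that do not match the pattern (page headers, blank lines) are skipped silently, so it is worth comparing the count against the source the first time through.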