Robert Berman wrote:
> Hello Emad,
>
> I have seriously looked at the documentation associated with pyPDF. This
> seems to have the page as its smallest element of work, and what I need
> is a line-by-line process to go from .PDF format to text. I don't think
> pyPDF will meet my needs, but thank you for bringing it to my attention.
>
> Thanks,
>
>
> Robert Berman

Have you looked at pdfminer?

http://www.unixuser.org/~euske/python/pdfminer/index.html

Looks promising.

HTH,
Marty

>
> Emad Nawfal (عماد نوفل) wrote:
>>
>>
>> On Tue, Apr 21, 2009 at 12:54 PM, bob gailer <bgai...@gmail.com> wrote:
>>
>>     Robert Berman wrote:
>>
>>         Hi,
>>
>>         I must convert a history file in PDF format that goes from May
>>         of 1988 to the current date. Readings are taken twice weekly and
>>         consist of the date taken (mm/dd/yy) and the results, appearing
>>         as a 10-character sequence of numeric and special characters.
>>         This is obviously an easy setup for a very small database
>>         application with the date as the key and the result string as
>>         the data.
>>
>>         My problem is converting the PDF file into a text file which I
>>         can then read and process. I do not see any free Python
>>         libraries having this capacity. I did see a PDFPILOT program
>>         for Windows, but this application is being developed on Linux
>>         and should also run on Windows, so I do not want to
>>         incorporate a Windows-only application.
>>
>>         I do not think I am breaking any new frontiers with this
>>         application. Have any of you worked with such a library, or do
>>         you know of one or two I can download and work with?
>>         Hopefully, they have reasonable documentation.
>>
>>
>>     If this is a one-time conversion, just use the "save as text" feature
>>     of Adobe Reader.
>>
>>
>>
>>         My development environment is:
>>
>>         Python
>>         Linux
>>         Ubuntu version 8.10
>>
>>
>>         Thanks for any help you might be able to offer.
>>
>>
>>         Robert Berman
>>         _______________________________________________
>>         Tutor maillist - Tutor@python.org
>>         http://mail.python.org/mailman/listinfo/tutor
>>
>>
>>
>>     --
>>     Bob Gailer
>>     Chapel Hill NC
>>     919-636-4239
>>
>>
>>
>> I tried pyPdf once, just for fun, and it was nice:
>> http://pybrary.net/pyPdf/
>> --
>> لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه
>> كالحقيقة.....محمد الغزالي
>> "No victim has ever been more repressed and alienated than the truth"
>>
>> Emad Soliman Nawfal
>> Indiana University, Bloomington
>> --------------------------------------------------------
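
Whichever tool ends up producing the text (pdfminer, pyPdf, or Adobe Reader's save-as-text), the line-by-line step described in the original question might look roughly like the sketch below. It is only a sketch under assumptions the thread does not confirm: each reading is assumed to sit on one line as an mm/dd/yy date, some whitespace, and then a 10-character result with no spaces in it. The names `READING` and `parse_readings` are made up for illustration, and the sample lines are invented, not taken from the real file.

```python
import re
from datetime import datetime

# Assumed layout: "mm/dd/yy", whitespace, then exactly 10 non-space characters.
READING = re.compile(r"(\d{2}/\d{2}/\d{2})\s+(\S{10})(?!\S)")


def parse_readings(lines):
    """Return {date: result} for every line that fits the assumed layout."""
    readings = {}
    for line in lines:
        match = READING.search(line)
        if match is None:
            continue  # headers, page numbers, blank lines, and so on
        try:
            # %y maps 69-99 to 1969-1999 and 00-68 to 2000-2068,
            # which covers a history starting in May 1988.
            taken = datetime.strptime(match.group(1), "%m/%d/%y").date()
        except ValueError:
            continue  # looked like a date but is not a valid one
        readings[taken] = match.group(2)
    return readings


# Invented sample text standing in for the converted PDF.
sample = [
    "Date      Result",
    "05/03/88  12-34+56*7",
    "04/21/09  98/76-54#3",
]
readings = parse_readings(sample)
for taken in sorted(readings):
    print(taken.isoformat(), readings[taken])
```

Because the dictionary is keyed by `datetime.date`, the readings sort chronologically and could be loaded into a small database table with the date as the key, as the question suggests. If the converted text splits a reading across lines, the regular expression would need adjusting.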