> I am trying to convert a pdf (and for that matter a
> postscript) type file to plain text. Response to my earlier
> mail suggested using File::Slurp, specifically;
>
> #!/usr/bin/perl
>
> use File::Slurp;
> use CGI qw/:standard/;
>
> $pdf_guts = read_file("/path/to/my.pdf");

Yes, $pdf_guts is the 'guts' of the pdf file. That was just a way for
you to get the pdf contents into the script properly. After that, yes,
you'd have to find a way to convert the contents of the pdf file
($pdf_guts) to whatever you wanted. I thought I'd explained that in
that message.

There are two ways you could do this:

1) See if there's a module to assist you.
   Did you look at http://search.cpan.org ?
2) Pipe $pdf_guts to an external program that can translate pdf to
   whatever it is you want.

I used the second method to translate html to pdf via the program
htmldoc. So instead of $pdf_guts I had $html_guts, and I ran htmldoc
from the script as I would have from the command line.

But you have to find a program that does the conversion first. That's
the part I'm not familiar with, and I realize that is what you're
asking about. So hopefully this gives you a better idea of what you
have to do, since it's not a complete solution.

> ---------------------------------
>
> In the above, however, $pdf_guts is an "unintelligible text
> file" (same contents and size as the pdf). What I will

It's not an "unintelligible text file". It is unintelligible, but a
pdf is not a text file. What you are seeing is the binary data that
makes the pdf file what it is. You'll need to have that if you are to
translate it somehow.

DMuey
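
P.S. Here is a rough sketch of option 2, in case it helps. It uses only
core modules. The helper name convert_with and the '{}' placeholder are
made up for illustration. pdftotext is just one candidate converter; I
am assuming it is installed and that giving it '-' as the output name
makes it print the text to stdout, so check that on your system first.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $data to a temporary file, run @cmd with
# every '{}' argument replaced by that file's name, and return whatever
# the command prints on STDOUT.
sub convert_with {
    my ($data, @cmd) = @_;

    my ($fh, $tmpname) = tempfile(UNLINK => 1);
    binmode $fh;                       # pdf data is binary, not text
    print {$fh} $data;
    close $fh or die "Cannot write $tmpname: $!";

    my @argv = map { $_ eq '{}' ? $tmpname : $_ } @cmd;

    # List form of open bypasses the shell, so nothing needs quoting.
    open(my $out, '-|', @argv) or die "Cannot run $argv[0]: $!";
    local $/;                          # slurp mode: read all output at once
    my $text = <$out>;
    close $out or die "$argv[0] failed (exit status $?)";
    return $text;
}

# Only runs when a file name is given on the command line.
# Assumption: pdftotext is installed and on the PATH.
if (@ARGV) {
    open(my $in, '<:raw', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    my $pdf_guts = do { local $/; <$in> };
    close $in;
    print convert_with($pdf_guts, 'pdftotext', '{}', '-');
}
```

If the pdf is already sitting on disk you could point the converter
straight at it and skip the temporary file. The temporary file is only
there so the example starts from $pdf_guts, as in your script.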