On Mon, 24 Mar 2003, Anne Durand wrote:
> When it doesn't work, I get almost the same except the lines with word: ....
So what you're saying is that some PDF files work just fine, but others don't seem to have any text?

Some PDF files simply don't contain text -- the program that wrote the file sent graphics data, but not any sort of text. That isn't the case here, though: I can extract text from the AthisMons.pdf file using pdftotext, so there's something else going on.

First off, I'd make sure your parser or converter works when you run it on its own. I don't recognize the "parsepdf.pl" script. Is it a renamed version of doc2html, or something else?

Check Q. 5.37 from the FAQ for a bit more:
http://www.htdig.org/FAQ.html#q5.37

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
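One way to separate "this PDF has no text" from "the converter isn't working" is to run pdftotext by hand on each problem file and look at what comes back. The following is only a rough sketch, not part of ht://Dig: the function names and the 20-character threshold are made up for illustration, and it assumes the pdftotext command from Xpdf or Poppler is on your PATH.

```python
import shutil
import subprocess
import sys


def has_extractable_text(text: str, min_chars: int = 20) -> bool:
    """Return True if the converter output holds at least min_chars
    non-whitespace characters. The threshold is an arbitrary assumption."""
    visible = "".join(text.split())  # drop spaces, newlines, form feeds
    return len(visible) >= min_chars


def extract_text(pdf_path: str) -> str:
    """Run `pdftotext <file> -` and return whatever it writes to stdout."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],  # "-" sends the text to stdout
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for path in sys.argv[1:]:
        try:
            found = has_extractable_text(extract_text(path))
        except (RuntimeError, subprocess.CalledProcessError) as exc:
            print(f"{path}: could not run pdftotext ({exc})")
            continue
        status = "text found" if found else "no text extracted (possibly graphics only)"
        print(f"{path}: {status}")
```

If a file reports "text found" here but still indexes as empty, the PDF itself is probably fine and the problem is more likely in how the parser or converter is hooked up.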

