On Mon, 24 Mar 2003, Anne Durand wrote:

> When it doesn't work, I get almost the same except the lines with word: ....

So what you're saying is that some PDF files work just fine, but others
don't seem to have any text. Some PDF files simply contain no text -- the
program that wrote the file emitted only graphics data, with no text at all.

In this case, I can extract text from the AthisMons.pdf file using
pdftotext, so there's something else going on. First off, I'd make sure
your parser or converter actually runs when you call it by hand on that
file. I don't recognize the "parsepdf.pl" script -- is it a renamed
version of doc2html, or something else?
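
As a rough sketch of that first check, something like the Python script
below would tell you whether a given PDF yields any text at all. It assumes
pdftotext is installed and on your PATH. The function names and the
five-word threshold are made up for illustration and are not part of
ht://Dig or of any parser script.

```python
import shutil
import subprocess
import sys


def looks_like_text(output: str, min_words: int = 5) -> bool:
    """Return True if the output holds at least min_words word-like tokens.

    The threshold is an arbitrary assumption; tune it to taste.
    """
    words = [w for w in output.split() if any(c.isalnum() for c in w)]
    return len(words) >= min_words


def extract_text(pdf_path: str) -> str:
    """Run pdftotext on pdf_path and return what it writes to stdout."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    # A "-" in place of the output file name makes pdftotext write to stdout.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        errors="replace",
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: check_pdf.py FILE.pdf")
    text = extract_text(sys.argv[1])
    print("text found" if looks_like_text(text) else "no usable text")
```

If this reports text for a file that still indexes as empty, the PDF itself
is probably fine and the problem is more likely in how the external parser
is being invoked.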

Check Q. 5.37 from the FAQ for a bit more:
http://www.htdig.org/FAQ.html#q5.37

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/








