external_parsers: \
application/pdf /usr/local/bin/doc2html
Now I get hundreds of error messages:
External parser error: unknown field in line <HTML>
URL: .... vcdstory.pdf
External parser error: unknown field in line <HEAD>
URL: .... vcdstory.pdf
....
It's not clear to me why this should be so hard.
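
A possible explanation, offered as a sketch rather than a confirmed diagnosis: a plain "application/pdf" entry in external_parsers tells htdig that the program emits ht://Dig's own tab-separated parser records, so HTML lines such as <HTML> and <HEAD> are rejected as unknown fields. The attribute also accepts a converter form, "type->type", which asks htdig to hand the program's output to its internal HTML parser. Assuming doc2html writes HTML to standard output (the path is the one from the message above; the script name on a given install may differ), the entry would look like this:

```
# Assumption: /usr/local/bin/doc2html writes HTML to stdout.
# "->text/html" marks it as a converter, not a native external parser.
external_parsers: application/pdf->text/html /usr/local/bin/doc2html
```

With this form the old pdf_parser setting is not needed, and max_doc_size still has to be large enough to hold the whole PDF.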
Geoff Hutchison wrote:
On Thu, 26 Dec 2002, Michael Friendly wrote:
I've read the FAQ on this topic, but still can't get rundig to index pdf files. I have set
max_doc_size: 500000
pdf_parser: /usr/bin/htdig-pdfparser
debian_pdf_parser: xpdf
and verified that pdftotext works from the command line on my debian

No, I don't think this is what you want to do. The pdf_parser attribute is
now quite deprecated--it really, truly expects Acrobat-generated PS
files.
I'd look at the FAQ again (specifically q4.9):
http://www.htdig.org/FAQ.html#q4.9
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
Michael Friendly Email: [EMAIL PROTECTED] Professor, Psychology Dept.
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT M3J 1P3 CANADA
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html

