According to [EMAIL PROTECTED]:
> I'm using htdig v3.1.6 on Mac OS X. Indexing is fine for HTML
> documents, but I've configured EXTERNAL PARSING for PDF file and get
> the following error:
> ---------
> URL: http://pdf.spiral.com/acfea/itineraries/BH011197-iti.pdf
> External parser error: unknown field in line <TITLE>Mt. Holyoke
> College Glee Club</TITLE>
> ---------
> htdig.conf has entry:
> external_parsers: application/pdf /usr/local/bin/doc2html.pl

You didn't follow the instructions for doc2html.pl very carefully. doc2html is an external converter, not an external parser, so you need to tell htdig what file type will be produced. You should have...

    external_parsers: application/pdf->text/html /usr/local/bin/doc2html.pl

See http://www.htdig.org/FAQ.html#q4.9 and http://www.htdig.org/attrs.html#external_parsers as well as the DETAILS file in contrib/doc2html. Without the "->" and target content-type, htdig will assume the parser will output preparsed records according to the external_parsers specification.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
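
To illustrate the parser-versus-converter distinction in the reply, here is a minimal Python sketch of how an external_parsers value might be read. It is not ht://Dig's actual code: the function names are hypothetical, and it assumes only that the value is a whitespace-separated list of pairs (a content-type spec, then a program path) and that a "->" in the spec marks an external converter with a target content-type.

```python
def split_parser_entries(value):
    """Split an external_parsers value into (type_spec, program) pairs.

    Assumption: entries are whitespace-separated and come in pairs.
    """
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected pairs of content-type spec and program path")
    return list(zip(tokens[0::2], tokens[1::2]))


def classify(type_spec):
    """Return (source_type, target_type).

    target_type is None when there is no "->", i.e. the program is
    treated as an external parser rather than an external converter.
    """
    source, arrow, target = type_spec.partition("->")
    return source, (target if arrow else None)


def describe(value):
    """Produce one human-readable line per configured entry."""
    lines = []
    for type_spec, program in split_parser_entries(value):
        source, target = classify(type_spec)
        if target is None:
            lines.append(
                f"{source}: {program} is treated as an external parser "
                "(expected to emit preparsed records)"
            )
        else:
            lines.append(
                f"{source}: {program} is treated as an external converter "
                f"producing {target}"
            )
    return lines


if __name__ == "__main__":
    # The entry from the question, then the corrected entry from the reply.
    for value in (
        "application/pdf /usr/local/bin/doc2html.pl",
        "application/pdf->text/html /usr/local/bin/doc2html.pl",
    ):
        for line in describe(value):
            print(line)
```

With the entry from the question, the sketch reports doc2html.pl as a parser, which matches the symptom: htdig read the script's HTML output as preparsed records and rejected the <TITLE> line as an unknown field.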

