OK, thanks. I'm using the acroread parser:

pdf_parser: /usr/local/Acrobat/4/bin/acroread

so that is probably the first problem I should take care of.

Thanks,
Curt

Gilles Detillieux wrote:
>
> According to Curtis J. Peredina:
> > When I index PDFs the summary lines in the results pages are appearing
> > like binary characters.
> >
> > Any good tips for removing this?
>
> Well, that depends a whole lot on how you're indexing PDFs, and what kind
> of PDFs you're indexing.  If you use doc2html.pl or conv_doc.pl, along
> with pdftotext from the xpdf 0.92 package, as I do, you're already using
> the most up-to-date technique to index PDFs.  Some PDFs just use strange
> encodings for some fonts, which pdftotext can't decipher.  We have 3 such
> PDFs on our SCRC web site (search for "presentation"), which I was unable
> to do anything about.
>
> If you're experiencing this problem with most or all PDFs, or if you get
> meaningful text from these PDFs when you run pdftotext manually on them,
> then the problem may lie elsewhere.  In this case, you should post more
> details about how you've configured htdig to index PDFs, after carefully
> trying out the suggestions in http://www.htdig.org/FAQ.html#q4.9
>
> --
> Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
> Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
> Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
> Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
>
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to <[EMAIL PROTECTED]> with
> a subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
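
A rough way to do the manual check Gilles describes (run pdftotext by hand and see whether the text that comes out is meaningful) might look like the sketch below. The helper name printable_percent and the file name sample.pdf are made up for illustration; the only thing taken from xpdf is pdftotext itself and its "-" argument for writing to standard output. Treat the number as a hint only: text with accented or other non-ASCII characters will score lower even when it is perfectly readable.

```shell
#!/bin/sh
# Hypothetical helper (not part of ht://Dig or xpdf): read text on stdin and
# print the percentage of bytes that are printable ASCII or whitespace.
# A low percentage suggests the extracted text is the "binary characters"
# kind of output rather than readable words.
printable_percent() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    # $(( ... )) strips the padding some wc implementations add.
    total=$(( $(wc -c < "$tmp") ))
    # Delete every byte that is NOT printable ASCII or whitespace, count the rest.
    printable=$(( $(LC_ALL=C tr -cd '[:print:][:space:]' < "$tmp" | wc -c) ))
    rm -f "$tmp"
    if [ "$total" -eq 0 ]; then
        echo 0
    else
        echo $(( printable * 100 / total ))
    fi
}

# Usage (sample.pdf is a placeholder file name):
#   pdftotext sample.pdf - | printable_percent
```

If a PDF scores high here but its excerpts still look like garbage in the search results, that points to the htdig configuration rather than the PDF itself.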

