According to Curtis J. Peredina:
> With the default search results, when the result is a PDF, I get the
> following:
> [nxtrend.pdf] <stars here>
> n6PYmD&'Z8P=@R].E'SKC-pLR:I[gjIO(S0?,*o+U!Qo=T%Oi*TJ
> <appropriate URL is fine here>
>
> What step generates the summary characters? I've been debugging this for
> a while to no avail. I've combed the FAQ, but it handles binary return
> strings in the more general sense. All other docs seem ok. This just
> affects PDF's.

The excerpt is generally collected by the same procedure that collects
words to be indexed. The one possible exception is an external parser
script (as opposed to an external converter), which can generate the "h"
record completely independently of the "w" records.

In a later message, you indicated that you're using acroread:

> pdf_parser: /usr/local/Acrobat/4/bin/acroread

Well, Acrobat 4 has a great many problems, but if it's not crashing on
you, then it may be that other tools for reading PDFs won't have any more
success with these files. I'm guessing that your PDFs use strange font
encodings. Still, it can't hurt to try pdftotext, and if that works, use
it with an external converter script like doc2html.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
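
One way to try the pdftotext suggestion before touching the ht://Dig
configuration is to run it on a problem file and check whether the output
looks like words at all. The following is only a rough sketch under stated
assumptions: pdftotext is on the PATH and writes to stdout when the output
file is given as "-", and the function names (word_ratio, looks_like_prose,
extract_text) and the 0.6 threshold are made up for illustration, not part
of ht://Dig or doc2html.

```python
import string
import subprocess


def word_ratio(text: str) -> float:
    """Fraction of whitespace-separated tokens that are purely alphabetic
    once leading and trailing punctuation is stripped."""
    tokens = text.split()
    if not tokens:
        return 0.0
    words = sum(tok.strip(string.punctuation).isalpha() for tok in tokens)
    return words / len(tokens)


def looks_like_prose(text: str, threshold: float = 0.6) -> bool:
    """Heuristic only: the threshold is an arbitrary assumption."""
    return word_ratio(text) >= threshold


def extract_text(pdf_path: str) -> str:
    """Run pdftotext on pdf_path and return what it prints to stdout.
    Assumes pdftotext is installed; "-" selects stdout as the output."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling looks_like_prose(extract_text("nxtrend.pdf")) on one of the affected
files would then give a quick hint: if it returns False, pdftotext is
probably producing the same kind of garbage as the excerpt quoted above,
and switching converters is unlikely to help.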

