Re: [htdig] Binary Characters in Summary

Curtis J. Peredina Fri, 12 Oct 2001 11:28:58 -0700

Ok. Thanks.

I'm using the acroread parser:


pdf_parser:             /usr/local/Acrobat/4/bin/acroread


So that is probably the first problem I should take care of.

Thanks,

Curt

Gilles Detillieux wrote:
> 
> According to Curtis J. Peredina:
> > When I index PDFs, the summary lines in the results pages appear as
> > binary characters.
> >
> > Any good tips for removing this?
> 
> Well, that depends a whole lot on how you're indexing PDFs, and what kind
> of PDFs you're indexing.  If you use doc2html.pl or conv_doc.pl, along
> with pdftotext from the xpdf 0.92 package, as I do, you're already using
> the most up-to-date technique to index PDFs.  Some PDFs just use strange
> encodings for some fonts, which pdftotext can't decipher.  We have 3 such
> PDFs on our SCRC web site (search for "presentation"), which I was unable
> to do anything about.
> 
> If you're experiencing this problem with most or all PDFs, or if you get
> meaningful text from these PDFs when you run pdftotext manually on them,
> then the problem may lie elsewhere.  In this case, you should post more
> details about how you've configured htdig to index PDFs, after carefully
> trying out the suggestions in http://www.htdig.org/FAQ.html#q4.9
> 
> --
> Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
> Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
> Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
> Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
> 
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to <[EMAIL PROTECTED]> with a
> subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
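
The manual check suggested in the quoted message (run pdftotext by hand and see whether the text is meaningful) could be scripted roughly as follows. This is only a sketch under assumptions that are not from the thread: it assumes a pdftotext build that accepts "-enc UTF-8" and writes to stdout when the output file is "-", and the helper names (printable_ratio, extract_text, report) and the 0.95 threshold are made up for illustration.

```python
import subprocess


def printable_ratio(text: str) -> float:
    """Fraction of characters that are printable or ordinary whitespace."""
    if not text:
        return 0.0
    ok = sum(
        1
        for ch in text
        # U+FFFD marks bytes that could not be decoded, so count it as bad.
        if ch != "\ufffd" and (ch.isprintable() or ch in "\n\r\t\f")
    )
    return ok / len(text)


def extract_text(pdf_path: str) -> str:
    """Run pdftotext on one file and return what it wrote to stdout.

    Assumption: the installed pdftotext supports "-enc UTF-8" and treats
    an output name of "-" as stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def report(paths, threshold=0.95):
    """Return one summary line per PDF; the threshold is an arbitrary guess."""
    lines = []
    for path in paths:
        ratio = printable_ratio(extract_text(path))
        flag = "ok" if ratio >= threshold else "suspect"
        lines.append(f"{path}: {ratio:.2%} printable ({flag})")
    return lines
```

Calling report() with a list of PDF paths and printing the result gives a quick list of files whose extracted text is mostly unreadable. Files flagged "suspect" here would likely also produce garbled summaries no matter how htdig is configured, while files that come out "ok" point to a problem elsewhere in the setup.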
