According to [EMAIL PROTECTED]:
> Hello Gilles:
> 
> Thanks for pointing out my faulty external parser configuration yesterday....
> I'm successfully indexing over 350 PDF files; however, we have about a
> dozen where we get the 'Error (0): PDF file is damaged...' error.
> Each of these files displays correctly in Acrobat on PCs and Macs;
> perhaps coincidentally, these errors occur on PDF files created on a
> Macintosh. The error occurs running pdf2html.pl directly or via HTDIG
> with a large max_doc_size. I've read through the archives and don't
> think it's the max_doc_size. Any suggestions?

If you get the error while running pdf2html.pl directly, then it's not
a problem with max_doc_size, because pdf2html.pl doesn't use that (or
any) config attribute.  The definitive test would be to run pdftotext
and/or xpdf directly on one of the PDF files that's giving you problems.
That would most likely give you the same error, which would point to
xpdf itself rather than to htdig or the pdf2html.pl script.
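
If you'd rather check the whole batch than test files one at a time,
something along these lines might help.  It's only a rough sketch: it
assumes pdftotext is on your PATH and accepts "pdftotext input.pdf
output.txt", and it treats either a non-zero exit status or the word
"Error" on stderr as a failure, since I can't promise every xpdf version
sets the exit status.  The function name find_failing_pdfs and the
converter parameter are made up for this example.

```python
import subprocess
import tempfile
from pathlib import Path


def find_failing_pdfs(pdf_paths, converter=("pdftotext",)):
    """Run the converter on each PDF; return (path, stderr) for each failure.

    `converter` is the command prefix to run (assumed here to be xpdf's
    pdftotext, invoked as: pdftotext <input.pdf> <output.txt>).
    """
    failures = []
    with tempfile.TemporaryDirectory() as tmp:
        # The extracted text is discarded; only the error status matters.
        out_file = str(Path(tmp) / "out.txt")
        for pdf in pdf_paths:
            result = subprocess.run(
                [*converter, str(pdf), out_file],
                capture_output=True,
                text=True,
            )
            # Assumption: a damaged file yields a non-zero exit status
            # and/or an "Error ..." message on stderr.
            if result.returncode != 0 or "Error" in result.stderr:
                failures.append((str(pdf), result.stderr.strip()))
    return failures
```

Calling find_failing_pdfs(sorted(Path("docs").glob("*.pdf"))) (the "docs"
directory is just a placeholder) would give you the list of files that
pdftotext itself rejects, without htdig or pdf2html.pl in the picture.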

I'd recommend you first make sure you're running a recent version of
xpdf on your system.  The current version is 1.01.  If you're running an
older version, it's worth trying a more recent one to see if it can
handle PDFs that older versions couldn't.  If the latest version still
has problems with some PDFs that you know are correct and readable in
Acrobat, then you may want to report them to xpdf's author.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html