On Fri, 4 May 2001 [EMAIL PROTECTED] wrote:
> I'm indexing some PDF documents without any problems (the maxdocsize is
> 5MB!). But when I run htmerge I get the following error: 'deleted no
> excerpt'. I think the PDF files were indexed first, but the
> update in the db does not work. What should I do?
There are two possible problems:
1) Are you sure all your PDF files are smaller than that limit? If a file
is larger, it will not be completely retrieved from the server and it
probably won't be parsed successfully.
2) Are you sure the PDF actually has text in it? Can you run the external
parser or converter that you're using by hand to check that it produces
any text? Many PDF files
look like text, but they're actually just bitmapped graphics or outlines
without the text information. :-(
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html