[htdig] DOC and PDF problem

Ninti Systems Mon, 05 Apr 2004 00:49:10 -0700

After a bit of fiddling, I now have HtDig running the external parsers
required for PDF and DOC files. All parsers work when run from the
command line. max_file_size is OK.


HtDig is finding these file types and apparently indexing them at least
partly (titles and/or metadata only, it seems).

rundig -v -v -v shows no problems; the files appear in the output
along with everything else, without complaints.

A search can turn the files up if the search terms are carefully selected.
The documents are listed with their titles enclosed in square brackets.
The excerpt text, however, is not useful: it is simply a repeated string
("Read 8192" or something similar).
It appears that the content is not being indexed, even though the system
does not report any specific problems.

Anyone seen/solved this before?

TIA, Mick





