I am trying to get htdig to index PDF files. I have acroread 3.0
installed on our server (an SGI running IRIX 6.4). I installed the patch
to htdig as explained by Colin Viebrock, and I changed my config file so
that PDF files would be indexed. When I look in the htdig log file, it
does indeed look like htdig is trying to index PDF files, but most of
them have an entry like
:7232:5223:4:http://www.weizmann.ac.il/CC/unixpages/Help-Reader.pdf:
/tmp/htdig2796.pdf: Could not repair file.
Some of them have entries like
12712:12873:9:http://www.wisdom.weizmann.ac.il/Journal/Volume_4/PDF/v4i1r21.pdf:
/tmp/htdig2796.pdf: Expected a dict object.
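
To see how many PDF files hit each error, the log can be tallied with a
short script. This is only a rough sketch: it assumes every entry looks
like the two samples above (colon-separated numeric fields, then the URL,
then an error line starting with the temporary file name). The function
name tally_pdf_errors is made up for illustration, and nothing is assumed
about what the numeric fields mean.

```python
import re
from collections import Counter

# Assumed entry shape: optional leading colon, numeric fields, URL, trailing colon.
URL_LINE = re.compile(
    r"^:?(?:\d+:)+(?P<url>https?://\S+?\.pdf):?\s*$", re.IGNORECASE
)
# Assumed error shape: "<temporary file>: <message>".
ERROR_LINE = re.compile(r"^/tmp/\S+\.pdf:\s*(?P<msg>.+?)\s*$")


def tally_pdf_errors(log_text):
    """Return (Counter of error messages, list of (url, message) pairs)."""
    counts = Counter()
    pairs = []
    last_url = None
    for raw in log_text.splitlines():
        line = raw.strip()
        url_match = URL_LINE.match(line)
        if url_match:
            # Remember the URL so the next error line can be attached to it.
            last_url = url_match.group("url")
            continue
        err_match = ERROR_LINE.match(line)
        if err_match and last_url is not None:
            msg = err_match.group("msg")
            counts[msg] += 1
            pairs.append((last_url, msg))
            last_url = None
    return counts, pairs
```

Calling tally_pdf_errors(open("htdig.log").read()) (the log file name is
a placeholder) gives a count per message and the list of affected URLs.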
Aside from installing the patch and taking pdf out of the bad_extensions
parameter, is there anything else that has to be done?
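
One thing that may be worth ruling out is whether the temporary file
handed to acroread is a complete PDF. The sketch below is a heuristic
only, and it assumes (without confirmation) that the errors above come
from a damaged or cut-off file. It checks for the "%PDF-" marker at the
start and the "%%EOF" marker near the end. The function names are
hypothetical, and the check would have to be run on a copy of the file,
since the temporary file may not survive the htdig run.

```python
def looks_complete_pdf(data):
    """Heuristic: PDF header at the start and an %%EOF marker near the end."""
    has_header = data.startswith(b"%PDF-")
    # PDF readers conventionally look for %%EOF within the last 1024 bytes.
    has_trailer = b"%%EOF" in data[-1024:]
    return has_header and has_trailer


def check_pdf_file(path):
    """Read a file from disk and apply the same heuristic to its bytes."""
    with open(path, "rb") as handle:
        return looks_complete_pdf(handle.read())
```

A file that fails this check is not a whole PDF; a file that passes may
still be damaged in other ways.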
Thanks for any help.
Malki Cymbalista
Software Support, Weizmann Institute Computing Center
Rehovot, Israel 76100
Internet: [EMAIL PROTECTED]