I'm trying to use htdig to index a secure intranet site, and I'm 
running into a problem indexing MS Word files.  When running in 
local_urls_only mode, the program apparently cannot handle any files 
other than html, pdf, ps and txt.  There is a check for those 
extensions in Document.cc; if none of them is found, it returns 
Document_not_local to Retriever.cc, which marks the file as not found. 
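
To illustrate the behaviour I mean, here is a rough, self-contained sketch of that kind of extension check.  It is not the actual htdig source: the enum, the function names and the example paths are made up for illustration, and only the name Document_not_local and the four extensions come from what I saw.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical status codes. Only the name Document_not_local is taken
// from the behaviour described above; the rest is invented for this sketch.
enum DocStatus { Document_ok, Document_not_local };

// Hypothetical helper: returns the lower-cased extension of the last path
// component, including the dot, or an empty string if there is none.
std::string extension_of(const std::string& path) {
    const std::string::size_type dot = path.rfind('.');
    const std::string::size_type slash = path.rfind('/');
    if (dot == std::string::npos ||
        (slash != std::string::npos && dot < slash)) {
        return "";
    }
    std::string ext = path.substr(dot);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// Accepts only the four extensions named above; anything else is reported
// as not local, which is why a .doc file ends up marked as not found.
DocStatus classify_local_file(const std::string& path) {
    static const char* const known[] = {".html", ".pdf", ".ps", ".txt"};
    const std::string ext = extension_of(path);
    for (const char* k : known) {
        if (ext == k) {
            return Document_ok;
        }
    }
    return Document_not_local;
}
```

With a check like this, a call such as classify_local_file("/srv/intranet/report.doc") falls through the list and comes back as Document_not_local, no matter which external parsers are configured.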

This seems to be fixed in 3.2.0, but that is still in beta.  Could 
FAQ entry 4.8 on the dig site be updated with something like, "If 
you are using 3.1.6 along with local_urls_only, you will not be able to 
index files other than html, pdf, ps or txt.  You must use the 3.2.0 
series."?  That might save someone else the time of tracking down 
this bug again.

If this topic has been covered to death, sorry; the SF mailing list 
search is currently down for me, so I couldn't search for it.  I 
was also impressed by how understandable the code for this project is: 
it was incredibly easy to find the parts of the code that dealt 
with the errors I was having.
Thanks
Josh

-- 
Lake Agassiz Regional Library - Moorhead MN larl.org
Josh Stompro               | Office 218.233.3757 EXT-139
LARL Network Administrator | Cell 218.790.2110  



_______________________________________________
ht://Dig general mailing list: <[email protected]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general
