I'm trying to use htdig to index a secure intranet site, and I'm running into a problem indexing MS Word files. It seems the program cannot handle any files other than html, pdf, ps and txt when running in local_urls_only mode. There is a check for those extensions in Document.cc; if none of them is found, it returns Document_not_local to Retriever.cc, which marks the file as not found.
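To make the behavior concrete, here is a rough sketch of the kind of check I mean. This is not the actual code from Document.cc: the function name checkLocalExtension and the Document_ok value are made up for illustration, and only Document_not_local and the four extensions come from what I saw.

```cpp
#include <string>

// Hypothetical status values; only Document_not_local is taken from htdig.
enum DocStatus { Document_ok, Document_not_local };

// Hypothetical helper mimicking the extension check described above:
// anything that is not html, pdf, ps or txt is reported as not local.
DocStatus checkLocalExtension(const std::string &path) {
    std::string::size_type dot = path.rfind('.');
    if (dot == std::string::npos) {
        return Document_not_local;   // no extension at all
    }
    const std::string ext = path.substr(dot + 1);
    static const char *const known[] = {"html", "pdf", "ps", "txt"};
    for (const char *k : known) {
        if (ext == k) {
            return Document_ok;      // recognized type, read it from disk
        }
    }
    return Document_not_local;       // e.g. .doc ends up here
}
```

With a check like this, a path ending in .doc gets Document_not_local, and with local_urls_only set there is no HTTP fallback, so the file ends up marked as not found.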
It seems like this is fixed in 3.2.0, but that is still in beta. Could FAQ entry 4.8 on the ht://Dig site be updated with something like: "If you are using 3.1.6 along with local_urls_only, you will not be able to index files other than html, pdf, ps or txt. You must use the 3.2.0 series." That way, maybe someone else won't have to spend time tracking down this bug again.

If this topic has been covered to death, sorry; the SF mailing list search is currently down for me, so I couldn't search for it.

I was also impressed by how understandable the code for this project is. It was incredibly easy to find the parts of the code that dealt with the errors I was having.

Thanks
Josh

--
Lake Agassiz Regional Library - Moorhead MN larl.org
Josh Stompro | Office 218.233.3757 EXT-139
LARL Network Administrator | Cell 218.790.2110

_______________________________________________
ht://Dig general mailing list: <[email protected]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general

