On Wed, 1 Jan 2003, Michael Schuerig wrote:

> I'm using htdig (3.1.5) to index a local collection of documents (html,

First, we highly suggest upgrading to 3.1.6, for stability and security
reasons:
http://www.htdig.org/RELEASE.html

> an existing index with htdig and htmerge. But I found that htdig seems
> to parse every document again and for this has to convert pdf and ps
> files to text again. This conversion is pretty time-consuming.

This is strange. If you're doing an update dig, htdig will send the
modification time to the server in an If-Modified-Since header. Apache
recognizes this and should not send a document unless it's been modified.

> start_url: http://localdocs/
> local_urls: http://localdocs/=/pub/doc/

Hmm. I'm currently on vacation, so it's hard for me to check, but I wonder
if the local_urls feature isn't checking the modification dates on the
drive before reindexing. :-(

One workaround is available in 3.1.6: htdig gained the -m flag, which
allows you to index only a set of URLs.
http://www.htdig.org/htdig.html

So you could use 'find' to generate a list of paths to new or modified
files, convert those paths to the corresponding "URLs", and write them to
a file for indexing.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
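
The 'find' workaround described above could be sketched roughly as follows, here in Python rather than shell. This is only an illustration under stated assumptions: the function names modified_urls and write_url_list are hypothetical, the /pub/doc/ to http://localdocs/ mapping is taken from the local_urls line quoted above, and "new or modified" is assumed to mean a file modification time later than a cutoff you choose (for example, the time of the last dig).

```python
import os
from urllib.parse import quote


def modified_urls(doc_root, url_prefix, since):
    """Return sorted URLs for files under doc_root modified after `since`.

    `since` is a Unix timestamp (seconds). The path-to-URL mapping mirrors
    a local_urls entry of the form url_prefix=doc_root (an assumption).
    """
    urls = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > since:
                # Path relative to the document root, with URL-style slashes.
                rel = os.path.relpath(path, doc_root).replace(os.sep, "/")
                # Percent-encode spaces and other unsafe characters.
                urls.append(url_prefix + quote(rel))
    return sorted(urls)


def write_url_list(urls, out_path):
    """Write one URL per line to out_path (hypothetical helper)."""
    with open(out_path, "w") as out:
        for url in urls:
            out.write(url + "\n")
```

For the configuration above, one might call modified_urls("/pub/doc/", "http://localdocs/", cutoff) and pass the result to write_url_list, then hand the resulting file to htdig's -m option. Check the htdig documentation for 3.1.6 for the exact invocation before relying on it.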

