I'm using htdig (3.1.5) to index a local collection of documents (HTML,
PDF, PS). The collection grows slowly but regularly, and I'd like to
keep the index current. My understanding so far was that I can update
an existing index with htdig and htmerge. But I found that htdig seems
to parse every document again, which means it has to convert the PDF
and PS files to text again. This conversion is pretty time-consuming.

What I don't understand is whether it has to be that way or whether I
can set some config attribute to avoid this behavior. Running htdig
with or without the -i option didn't make a difference.

My htdig.conf contains nothing spectacular:

start_url:              http://localdocs/
local_urls:             http://localdocs/=/pub/doc/

Thus, where possible, htdig reads files directly from the file system.
local_urls_only is false by default, so htdig still goes through HTTP
for the rest; this way Apache generates the directory listings.
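
To illustrate the mapping I expect from local_urls, here is a minimal
Python sketch. It is only an illustration of the prefix substitution
described above, not htdig's actual code; the function name
map_url_to_path and the example file manuals/guide.pdf are made up.

```python
# Hypothetical illustration of the prefix mapping that local_urls
# describes; this is not how htdig itself is implemented.
LOCAL_URLS = {"http://localdocs/": "/pub/doc/"}


def map_url_to_path(url, mappings=LOCAL_URLS):
    """Return the local file path for url, or None if no prefix matches."""
    for url_prefix, path_prefix in mappings.items():
        if url.startswith(url_prefix):
            # Swap the URL prefix for the file system prefix.
            return path_prefix + url[len(url_prefix):]
    # No mapping applies: the document would have to be fetched via HTTP.
    return None


print(map_url_to_path("http://localdocs/manuals/guide.pdf"))
print(map_url_to_path("http://elsewhere/guide.pdf"))
```

The first call prints /pub/doc/manuals/guide.pdf and the second prints
None, which stands for the case where the document has to be retrieved
over HTTP instead.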

Michael

-- 
Michael Schuerig                  If at first you don't succeed...
mailto:[EMAIL PROTECTED]           try, try again.
http://www.schuerig.de/michael/   --Jerome Morrow, "Gattaca"


