According to Tod Thomas:

> On one of our machines we have migrated to a different search
> technology. However, I have constructed a real nice dead link report
> from the htdig log and want to keep that running. The only problem is I
> have to keep re-indexing the site, which takes up a LOT of space for the
> db files. I tried symlinking them to /dev/null but it fails, which I
> kind of expected.
>
> Is there a way to run htdig but not have it create its databases, just so
> I can get the log created? I know this might sound like overkill - I
> could probably do something along the same lines with wget, or some
> other spidering program - but I've got enough customization in place to
> make it a little painful to explore alternatives.
>
> Any ideas?
htdig needs its docdb to track what it's doing, but you can drastically reduce the size of that file by setting max_head_length and max_meta_description_length to 0, and by setting minimum_word_length to something large. Also, if you're using 3.1.x, you can probably safely link db.wordlist to /dev/null to avoid keeping any words. I'm assuming you're not running htmerge after htdig in this case.

See also http://www.htdig.org/FAQ.html#q4.6

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
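[Editor's note: as a sketch of the overrides suggested above, the relevant lines in the htdig configuration file might look like the following. The attribute names are the ones named in the reply; the value chosen for minimum_word_length is arbitrary, anything longer than real words will do.]

    # Keep the document database as small as possible; only the crawl log matters here.
    max_head_length:             0
    max_meta_description_length: 0
    # Longer than any real word, so effectively nothing is indexed.
    minimum_word_length:         32

[On 3.1.x, the word list can then be discarded as described, for example with
ln -s /dev/null db.wordlist in the database directory (path depends on your setup),
and htmerge is simply never run afterwards.]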

