I just noticed that the nightly update dig (-a) of one of my mailing list archives (produced by mhonarc) misses new messages added to the archive. I have a suspicion why this happens. Can anybody confirm this?
In the archive, the index pages (thread and date index) carry the meta tag <META name=robots content="noindex,follow"> because I don't want these pages to show up in search results themselves, but they are the (only) way htdig can find all messages. From the debug output of the nightly update digs, it seems that htdig in -a mode just checks all documents in the database for changes. But the index pages are excluded from the database, so it never checks those for new links.

Shouldn't it either traverse the document space starting with the start URLs for update digs as well (in the same way it does for an initial dig), or maybe keep a list of excluded documents and check those too? That is, while an excluded document should not be used by htsearch, it should still be checked by an update dig.

Roman Maeder

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
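The suspected behaviour can be illustrated with a small simulation. This is only a sketch in Python, not htdig's actual code: the site layout, the URLs and all function names are hypothetical, and it assumes that message pages do not link to each other, so the noindex,follow index page is the only path to new messages.

```python
from collections import deque

def crawl(site, seeds):
    """Follow links breadth-first from the seeds; return every URL reached."""
    seen, queue = set(), deque(seeds)
    while queue:
        url = queue.popleft()
        if url in seen or url not in site:
            continue
        seen.add(url)
        queue.extend(site[url]["links"])
    return seen

def indexable(site, urls):
    """Drop pages marked noindex; only the rest are stored in the database."""
    return {u for u in urls if not site[u]["noindex"]}

def initial_dig(site, start_urls):
    """Initial dig: traverse from the start URLs, store the indexable pages."""
    return indexable(site, crawl(site, start_urls))

def update_from_database(site, database):
    """Suspected update behaviour: only revisit documents already stored."""
    return indexable(site, crawl(site, database))

def update_from_start_urls(site, start_urls):
    """Proposed update behaviour: traverse from the start URLs again."""
    return indexable(site, crawl(site, start_urls))

# Hypothetical archive: one index page (noindex,follow) and two messages.
start_urls = ["/threads.html"]
site = {
    "/threads.html": {"links": ["/msg1.html", "/msg2.html"], "noindex": True},
    "/msg1.html": {"links": [], "noindex": False},
    "/msg2.html": {"links": [], "noindex": False},
}
db = initial_dig(site, start_urls)

# A new message arrives and is linked only from the index page.
site["/msg3.html"] = {"links": [], "noindex": False}
site["/threads.html"]["links"].append("/msg3.html")

print("database-only update:", sorted(update_from_database(site, db)))
print("start-URL update:    ", sorted(update_from_start_urls(site, start_urls)))
```

In this model the database-only update never revisits /threads.html, because that page was never stored, so /msg3.html stays undiscovered. Re-traversing from the start URLs finds it. Keeping a separate list of excluded-but-followed pages and adding them to the seeds would have the same effect.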

