On Fri, 18 Feb 2005, Janine Sisk wrote:

On Feb 17, 2005, at 8:58 PM, Jim wrote:

On Thu, 17 Feb 2005, Chuck Phillips (Console, Inc.) wrote:

I expected that enabling use_doc_date would make my modified rundig (no -i, no -a) only update the index for pages that have newer meta dates.

I don't think that use_doc_date is intended to be used in this way.

So then what is the "correct" way to do this? I have a site that takes about 30 hours to fully index, so obviously I'd like to just do an update most of the time, but it sounds like this isn't going to be as easy as dropping the -i and -a. I've run that a couple of times on a subset of my pages and it looks like they are all being processed each time.

htdig decides whether a document needs to be reindexed by sending the 'If-Modified-Since:' request header. If the server you are contacting respects this header and returns correct last-modified dates when documents are retrieved, then dropping the -i option should prevent unmodified documents from being reindexed. If the server does not meet these conditions, or the pages are dynamically generated in a way that leaves them with no last-modified date, then htdig assumes that the document needs to be reindexed.
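To check whether a given server behaves this way, something like the following Python sketch may help. It is not htdig's code, only an illustration of the HTTP conditional GET described above. The function names (conditional_headers, needs_reindex, check_url) are made up for this example, and the decision rule for a missing or unparseable Last-Modified header is an assumption based on the behaviour described here.

```python
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime
import urllib.error
import urllib.request


def conditional_headers(last_indexed: datetime) -> dict:
    """Build an If-Modified-Since header from the time of the previous run."""
    stamp = formatdate(last_indexed.timestamp(), usegmt=True)
    return {"If-Modified-Since": stamp}


def needs_reindex(status: int, last_modified, last_indexed: datetime) -> bool:
    """Decide whether a document should be reindexed (hypothetical rule)."""
    if status == 304:
        # Server honoured the conditional request: document is unchanged.
        return False
    if last_modified is None:
        # No date at all (typical for dynamic pages): assume it changed.
        return True
    try:
        modified = parsedate_to_datetime(last_modified)
    except (TypeError, ValueError):
        # Unparseable date: treat like a missing one.
        return True
    if modified.tzinfo is None:
        # Assumption: a date without zone information is taken as UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > last_indexed


def check_url(url: str, last_indexed: datetime) -> bool:
    """Send a conditional GET and report whether the page looks modified."""
    request = urllib.request.Request(url, headers=conditional_headers(last_indexed))
    try:
        with urllib.request.urlopen(request) as response:
            return needs_reindex(
                response.status, response.headers.get("Last-Modified"), last_indexed
            )
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError.
        if err.code == 304:
            return False
        raise
```

Calling check_url with one of your own page URLs and the (timezone-aware) time of your last dig shows how the server responds. If it returns True for pages you know are unchanged, the server is probably not honouring If-Modified-Since or is not sending Last-Modified, and an update dig will refetch everything.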

Jim

