On Thu, 17 Feb 2005, Chuck Phillips (Console, Inc.) wrote:

> I expected that enabling use_doc_date would make my modified rundig (no -i, no -a) only update the index for pages that have newer meta dates.

I don't think use_doc_date is intended to be used this way. By the time htdig knows anything about the date/time in the meta tag, it has already checked the response header, retrieved the document, and started parsing its content. In many cases I am not sure there is much to gain by bailing out at that point. I am not claiming that holds in your case, but in the general case you have already paid the price of hitting the network, setting up the structures for the document, and doing some amount of text parsing.

As far as I know, only the date returned in the response header is
considered when determining whether to reindex based on modification time.
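
To illustrate why the header is the natural place for that decision, here is a rough Python sketch of a header-only check. It is not htdig's actual code; the function name needs_reindex and its arguments are made up for illustration. The point is that the decision can be made before the document body is fetched, whereas a meta date is only available after retrieval and parsing.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def needs_reindex(last_modified_header, last_indexed):
    """Hypothetical check: decide from the HTTP Last-Modified header alone.

    last_modified_header: raw header string, or None if the server sent none.
    last_indexed: timezone-aware datetime of the previous indexing run.
    """
    if last_modified_header is None:
        # No header at all: assume the document may have changed.
        return True
    try:
        modified = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        # Unparseable header: be conservative and reindex.
        return True
    if modified.tzinfo is None:
        # Treat a zone-less date as UTC so the comparison is well defined.
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > last_indexed


last_run = datetime(2005, 2, 1, tzinfo=timezone.utc)
print(needs_reindex("Thu, 17 Feb 2005 10:00:00 GMT", last_run))
```

Nothing in this check needs the document body, which is why a date found inside the document arrives too late to save the retrieval cost.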

> I read in an archive that this output isn't confirmation that htdig is using the meta date, that if it fails and defaults to now it will do so silently.

True, but I think that can only happen if the code fails to parse the string in the content attribute. The value of your content attribute looked fine. If what you see in the time: output is in the correct format, then I think it is safe to assume that the date was associated with the document when it was indexed.
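
As a rough illustration of that silent fallback, here is a small Python sketch. It is not htdig's parser: the function name meta_date_or_now and the two accepted formats are assumptions for the example, and htdig may accept other formats. The second return value makes the fallback visible, which the real output apparently does not.

```python
from datetime import datetime

# Assumed formats, for illustration only; htdig's accepted formats may differ.
ASSUMED_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def meta_date_or_now(content, now):
    """Return (date, parsed_ok) for a meta tag's content attribute.

    If none of the assumed formats match, fall back to `now` without
    raising, and report parsed_ok=False so the caller can tell.
    """
    text = content.strip()
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(text, fmt), True
        except ValueError:
            continue
    return now, False


now = datetime(2005, 2, 17, 12, 0, 0)
print(meta_date_or_now("2005-02-10", now))
print(meta_date_or_now("not a date", now))
```

If the date shown in the time: output matches what is in your meta tag rather than the time of the dig, the parse succeeded.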

Jim


