According to [EMAIL PROTECTED]:
> Is there some documentation on the format/content of the databases, as
> produced by htdig and htmerge?
>
> What I'd like to be able to do, if feasible, is to tell from the databases
> themselves which URLs have been indexed, and ideally the date on which this
> was done.

I don't think there's much documentation on the specific format of the
databases, other than the source code. I don't think the date on which
a document was last indexed is stored, but the last-modified date is
stored in db.docdb. For documents where the server doesn't return a
last-modified date, e.g. dynamic content, that field will hold the
date on which the document was indexed instead.

It would probably be pretty easy to build a simple docdb dumping tool
out of htnotify, which already does a straightforward traversal of the
database. You could get it to output any field you want from the
"DocumentRef" object. Apart from that, I don't think any such tool
exists yet, though it's on the to-do list for 3.2.
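
To give a rough idea of the shape such a dumping tool might take, here is
a minimal sketch in Python. It is only an illustration of the traversal:
it does not read a real db.docdb, and a real tool would more likely be
written in C++ starting from htnotify. The names SAMPLE_DOCDB, format_time
and dump_docdb, and the "modified" and "title" field names, are all
hypothetical stand-ins rather than anything taken from the htdig source.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for the records in db.docdb, keyed by URL.
# The real database stores serialized DocumentRef objects; these field
# names are illustrative only and do not reflect htdig's on-disk layout.
SAMPLE_DOCDB = {
    "http://www.example.com/": {"modified": 946684800, "title": "Home"},
    "http://www.example.com/cgi-bin/search": {"modified": 951782400, "title": "Search"},
}


def format_time(epoch_seconds):
    """Render a Unix timestamp as a UTC date and time string."""
    stamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d %H:%M:%S")


def dump_docdb(docdb, fields=("modified",)):
    """Visit every record once and return one tab-separated line per URL."""
    lines = []
    for url in sorted(docdb):
        record = docdb[url]
        values = []
        for name in fields:
            value = record.get(name, "")
            # Only the assumed "modified" field is treated as a timestamp.
            if name == "modified" and value != "":
                value = format_time(value)
            values.append(str(value))
        lines.append("\t".join([url] + values))
    return lines


for line in dump_docdb(SAMPLE_DOCDB, fields=("modified", "title")):
    print(line)
```

The point is just that the tool is a single pass over the records that
prints the URL plus whichever fields you ask for. Bear in mind that the
"modified" value would be the indexing date for documents whose server
sent no last-modified date.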
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930