At 11:50 AM -0500 12/4/99, Tom Metro wrote:
>not very nicely formatted. Has any thought been given to having htdig
>generate a Common Log Format (CLF) log file?
Yup. It's been tossed around a few times. It's just not high on
anyone's todo list right now.
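For reference, a CLF entry has the shape `host ident authuser [date] "request" status bytes`. The sketch below only illustrates that layout; it is not htdig code. The function name `clf_line` and the sample values are hypothetical, and how htdig would map its fetches onto these fields is an open assumption.

```python
from datetime import datetime, timedelta, timezone

# Month abbreviations are spelled out so the output does not depend on locale.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def clf_line(host, when, request, status, size):
    """Format one Common Log Format entry (hypothetical helper).

    `when` must be a timezone-aware datetime; `size` may be None,
    in which case CLF uses "-" for the byte count.
    """
    offset = when.strftime("%z")  # e.g. "-0500"
    stamp = "%02d/%s/%04d:%02d:%02d:%02d %s" % (
        when.day, _MONTHS[when.month - 1], when.year,
        when.hour, when.minute, when.second, offset)
    size_field = "-" if size is None else str(size)
    # ident and authuser are unknown here, so both are written as "-".
    return '%s - - [%s] "%s" %d %s' % (host, stamp, request, status, size_field)


if __name__ == "__main__":
    when = datetime(1999, 12, 4, 11, 50, 0,
                    tzinfo=timezone(timedelta(hours=-5)))
    print(clf_line("www.example.com", when,
                   "GET /index.html HTTP/1.0", 200, 1234))
```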
>having to perform a dig, and what I'd really like would be a tool that
>could produce a list of indexed documents after-the-fact (so a
>database generated by an overnight cron job could be examined).
One of my projects for after the 3.2.0b1 release is to roll a bunch
of tools like this into a directory called httools. Both htmerge and
htnotify will be moved in, and htmerge would be simplified--it would
only merge databases. Right now I see a need for htmerge, htnotify,
htpurge, htdump, htrestore, and maybe one or two more.
>[I gave it a spin. The answer appears to be: <db-dir>/db.urls and that
>it gets overwritten each run. More importantly, I noticed that
>off-site URLs and mailto: URLs were included, which was not what I
>expected. Looking at the documentation, it does say "URLs that were
>seen" rather than URLs that were retrieved, as I was thinking. Perhaps
>that needs to be emphasized in the documentation. So I guess I'm out
>of luck if I want a list of URLs that were dug? (unless I filter the
>output from -v)]
Actually, I use -t and pipe that through some stuff to generate a
list of URLs that were dug. But no, it's not particularly easy. Just
like it's not particularly easy to purge a set of documents from the
databases...
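As a rough illustration of the kind of post-filtering involved, here is a sketch that trims a list of "seen" URLs down to on-site HTTP URLs. It is not what I actually run, and it assumes the list has one URL per line; the function name `onsite_http_urls` and the example host are made up. It only drops the off-site and mailto: entries; it cannot tell whether a URL was actually retrieved.

```python
from urllib.parse import urlsplit


def onsite_http_urls(lines, allowed_hosts):
    """Keep only http/https URLs whose host is in allowed_hosts.

    `lines` is any iterable of strings, assumed to hold one URL each.
    """
    allowed = {h.lower() for h in allowed_hosts}
    kept = []
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https"):
            continue  # drops mailto:, ftp:, and so on
        if (parts.hostname or "") not in allowed:
            continue  # drops off-site URLs
        kept.append(url)
    return kept


if __name__ == "__main__":
    seen = [
        "http://www.example.com/index.html",
        "mailto:webmaster@example.com",
        "http://other.example.org/page.html",
        "http://WWW.EXAMPLE.COM/docs/",
    ]
    for url in onsite_http_urls(seen, ["www.example.com"]):
        print(url)
```

The same function could be fed the lines of the URL list file after a dig, but the result is still a list of on-site URLs that were seen, which is only an approximation of the ones that were dug.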
-Geoff