At 11:50 AM -0500 12/4/99, Tom Metro wrote:
>not very nicely formatted. Has any thought been given to having htdig
>generate a Common Log Format (CLF) log file?

Yup. It's been tossed around a few times. It's just not high on 
anyone's todo list right now.

>having to perform a dig, and what I'd really like would be a tool that
>could produce a list of indexed documents after-the-fact (so a
>database generated by an overnight cron job could be examined).

One of my projects for after the 3.2.0b1 release is to roll a bunch 
of tools like this into a directory called httools. Both htmerge and 
htnotify will be moved in, and htmerge would be simplified--it would 
only merge databases. Right now I see a need for htmerge, htnotify, 
htpurge, htdump, htrestore, and maybe one or two more.

>[I gave it a spin. The answer appears to be: <db-dir>/db.urls and that
>it gets overwritten each run. More importantly, I noticed that
>off-site URLs and mailto: URLs were included, which was not what I
>expected. Looking at the documentation, it does say "URLs that were
>seen" rather than URLs that were retrieved, as I was thinking. Perhaps
>that needs to be emphasized in the documentation. So I guess I'm out
>of luck if I want a list of URLs that were dug? (unless I filter the
>output from -v)]

Actually, I use -t and pipe that through some stuff to generate a 
list of URLs that were dug. But no, it's not particularly easy. Just 
like it's not particularly easy to purge a set of documents from the 
databases...
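
A rough sketch of one workaround for the db.urls problem described in the quoted text: drop the mailto: and off-site entries so only on-site URLs remain. This is not the -t pipeline mentioned above. It assumes db.urls is plain text with one URL per line, and the function name and host argument are made up for illustration. The result is still a list of URLs that were seen on that host, which is not necessarily the same as the URLs that were retrieved.

```python
from urllib.parse import urlsplit


def on_site_urls(lines, host):
    """Yield http(s) URLs from `lines` whose host matches `host`.

    Assumes one URL per line (an assumption about the db.urls layout).
    mailto: links and URLs on other hosts are skipped.
    """
    wanted = host.lower()
    for line in lines:
        url = line.strip()
        if not url:
            continue  # ignore blank lines
        parts = urlsplit(url)
        if parts.scheme in ("http", "https") and parts.hostname == wanted:
            yield url
```

It could be run against the file with something like `with open("db.urls") as f: print("\n".join(on_site_urls(f, "www.example.com")))`, where the path and host name are placeholders.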

-Geoff


