At 11:27 AM -0500 12/4/99, Tom Metro wrote:
>I don't even know what your database schema looks like (one of the
>reasons why I suggested having an "architectural overview" document),
See htcommon/DocumentRef.h if you want to know what fields are stored
in the document database. Yes, we probably need to write up a doc on
the database formats, but during 3.2 development they changed fairly
frequently.
>but are document titles compressed using one of these schemes? Is it
>part of the DocHead that is compressed with zlib?
The DocHead is the entire stored excerpt--essentially the file
stripped of any markup. The entire field is compressed with zlib.
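
For anyone reading that field from a script, here is a minimal sketch of
the inflate step in Python. It assumes the stored bytes are a plain zlib
stream and that the text is Latin-1; it does not cover how the field is
located inside the database record, and decompress_head is a made-up name,
not an htdig routine.

```python
import zlib


def decompress_head(stored: bytes) -> str:
    """Inflate a zlib-compressed excerpt and decode it as text.

    Assumption: `stored` is a complete zlib stream and the text is
    Latin-1. The real record layout around the field is not handled here.
    """
    raw = zlib.decompress(stored)
    return raw.decode("latin-1")


if __name__ == "__main__":
    # Round trip on sample text, standing in for a real DocHead field.
    excerpt = "ht://Dig is a web indexing and searching system."
    stored = zlib.compress(excerpt.encode("latin-1"))
    print(decompress_head(stored))
```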
>Decoding common_url_parts reliably with an external script will be
>tricky because even if you parse htdig.conf and find it absent, you
>still need to keep in sync with htdig's compiled-in defaults, which
>may have changed since the script was written.
>
>Are you aware of any Perl scripts that have been written to decompress
>the URLs?
I don't think so. This is one reason I suggested an XS module might
help. It could call the C++ routines directly, so the script would not
need to duplicate the compiled-in defaults. Then again, I'm not much
of a Perl expert.
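
To illustrate the sync problem, here is a toy sketch in Python. The table
and the one-control-character-per-entry encoding are hypothetical, made up
for the example; they are not htdig's actual defaults or on-disk encoding.
The point is only that a decoder with a stale table silently produces the
wrong URL.

```python
# Hypothetical table of common URL parts; NOT htdig's compiled-in default.
COMMON_PARTS = ["http://", "http://www.", ".com/", ".html"]

# A stale copy, as an external script might carry after the defaults change.
STALE_PARTS = ["http://", "https://", ".com/", ".html"]


def encode(url, parts):
    """Replace each common part with a one-character code (toy scheme)."""
    # Substitute longer parts first so "http://www." wins over "http://".
    for i, part in sorted(enumerate(parts), key=lambda p: -len(p[1])):
        url = url.replace(part, chr(1 + i))
    return url


def decode(coded, parts):
    """Expand each one-character code back into its table entry."""
    for i, part in enumerate(parts):
        coded = coded.replace(chr(1 + i), part)
    return coded


if __name__ == "__main__":
    url = "http://www.example.com/index.html"
    coded = encode(url, COMMON_PARTS)
    print(decode(coded, COMMON_PARTS))  # same table: original URL
    print(decode(coded, STALE_PARTS))   # stale table: a different URL
```

A wrapper that calls the C++ routines would sidestep this, since the table
would only ever live in one place.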
-Geoff