At 11:27 AM -0500 12/4/99, Tom Metro wrote:
>I don't even know what your database schema looks like (one of the
>reasons why I suggested having an "architectural overview" document),

See htcommon/DocumentRef.h if you want to know what fields are stored 
in the document database. Yes, we probably need to write up a doc on 
the database formats, but they changed fairly frequently during 3.2 
development.

>but are document titles compressed using one of these schemes? Is it
>part of the DocHead that is compressed with zlib?

The DocHead is the entire stored excerpt--essentially the file 
stripped of any markup. The entire field is compressed with zlib.
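
As a rough illustration of what an external script would have to do with 
that field, here is a minimal sketch in Python (not Perl). It assumes the 
DocHead bytes have already been pulled out of the database record and form 
one complete zlib stream; the record layout, the function name 
decompress_doc_head, and the Latin-1 text encoding are all assumptions for 
illustration, not something taken from the htdig sources.

```python
import zlib


def decompress_doc_head(raw):
    """Inflate a zlib-compressed excerpt and return it as text.

    Assumption: `raw` is a complete zlib stream holding only the
    DocHead field. How that field is framed inside the real database
    record is not covered here.
    """
    inflated = zlib.decompress(raw)
    # Assumption: the excerpt is single-byte text; Latin-1 never fails
    # to decode, so it is a safe placeholder encoding.
    return inflated.decode("latin-1")


# Round trip with made-up data standing in for a stored excerpt.
sample = "Example excerpt with the markup stripped."
packed = zlib.compress(sample.encode("latin-1"))
restored = decompress_doc_head(packed)
```

If the stored bytes turn out to carry extra framing around the zlib stream, 
zlib.decompress will raise zlib.error, which is a quick way to find out 
that the assumption above does not hold.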

>Decoding common_url_parts reliably with an external script will be
>tricky because even if you parse htdig.conf and find it absent, you
>still need to keep in sync with htdig's compiled-in defaults, which
>may have changed since the script was written.
>
>Are you aware of any Perl scripts that have been written to decompress
>the URLs?

I don't think so. This is one reason I suggested an XS module might 
help. It could call the C++ routines directly, forgoing any need to 
decode the compiled-in defaults. Then again, I'm not much of a Perl 
expert.
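
To make the sync problem concrete, here is a small Python sketch of a 
table-driven substitution. The scheme is purely hypothetical (low byte 
values standing in for entries of a parts table) and is not htdig's actual 
encoding; the function names and both tables are invented for the example. 
It only shows that decoding with a table that differs from the one used at 
encoding time silently yields a wrong URL.

```python
def encode_url(url, parts):
    """Replace each table entry in `url` with a one-byte code.

    Hypothetical scheme: entry i (counting from 1) becomes byte value i.
    """
    data = url
    for code, part in enumerate(parts, start=1):
        data = data.replace(part, chr(code))
    return data.encode("latin-1")


def decode_url(encoded, parts):
    """Expand one-byte codes back into the entries of `parts`."""
    pieces = []
    for byte in encoded:
        if 1 <= byte <= len(parts):
            pieces.append(parts[byte - 1])  # code byte: look up its entry
        else:
            pieces.append(chr(byte))        # ordinary character
    return "".join(pieces)


# Table used when the database was built (made up for the example).
build_time_parts = ["http://", ".html"]
# Table an out-of-date script believes is in effect (also made up).
stale_parts = ["http://www.", ".html"]

url = "http://www.example.com/index.html"
stored = encode_url(url, build_time_parts)

good = decode_url(stored, build_time_parts)
bad = decode_url(stored, stale_parts)
```

With the matching table the URL comes back intact; with the stale table 
the decoder raises no error and simply returns a different string, which is 
why calling the real routines directly is attractive.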

-Geoff

