Re: [htdig3-dev] Berkeley DB2 and Perl scripts

Geoff Hutchison Fri, 3 Dec 1999 17:50:21 -0800

At 7:59 PM -0500 12/3/99, Tom Metro wrote:
>  > But beyond using Berkeley DB2, the code now encodes/compresses URLs
>  > as well as excerpts.
>Right, good point. I noticed that when browsing through the attributes
>documentation. You use zlib now, but previously you used
>url_part_aliases and common_url_parts as a simple form of compression -
>right? Although I didn't see any evidence that the existing Perl
>scripts handled any form of compression. Do they predate all forms of
>compression or are they written with the assumption that compression
>is turned off?

Not quite. The existing Perl scripts don't handle compression because 
they predate it: compression, url_part_aliases, and common_url_parts 
were all added after the scripts were written.

But to clarify your point, zlib and url_part_aliases/common_url_parts 
are used on different things. zlib is used *solely* on document 
excerpts (the DocHead field), while url_part_aliases and 
common_url_parts are applied to URLs in both the document database 
and the document index (the URL->DocID list).

So there are two steps to decoding an entry: first decode the URL 
based on url_part_aliases and common_url_parts, then decompress the 
DocHead field if it's compressed.
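
As a rough illustration of those two steps, here is a minimal Python 
sketch. It is not the actual ht://Dig code: the placeholder byte codes, 
the URL_PART_CODES table, and the function names are all hypothetical, 
and the real on-disk encoding may differ. It only assumes that URL parts 
are swapped for short codes and that DocHead may or may not be a zlib 
stream.

```python
import zlib

# Hypothetical table: in practice the mapping would be derived from the
# url_part_aliases and common_url_parts attributes. The codes are invented.
URL_PART_CODES = {
    "\x01": "http://",
    "\x02": "www.",
    "\x03": ".html",
}

def decode_url(encoded, codes=URL_PART_CODES):
    """Step 1: expand each placeholder code back into its URL part."""
    return "".join(codes.get(ch, ch) for ch in encoded)

def decode_doc_head(raw):
    """Step 2: zlib-decompress DocHead if compressed, else pass it through."""
    try:
        data = zlib.decompress(raw)
    except zlib.error:
        # Not a zlib stream, so assume the excerpt was stored uncompressed.
        data = raw
    return data.decode("latin-1")
```

A script would call decode_url() on the URL taken from either the 
document database or the URL->DocID list, and decode_doc_head() only on 
the DocHead field.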

-Geoff
