Hi, my name is Amy Adams and I'm the Berkeley DB Product Manager for
Sleepycat Software.

I wanted to respond to Support Request #950 which you had open with us.

> From: [EMAIL PROTECTED]
> Subject: Re: prefix comparison costs pages [#950]
>
> Sleepycat Software writes:
>> 
>> If you specified the default prefix function as your prefix function, and
>> got different results than specifying no prefix function, I'm interested
>> in tracking that down, that sounds like a problem.
>
>  Sorry for not being clear. Here is what I do :
>
>  1) Leave the defaults (i.e. the default prefix function is on)
>     Get 548 internal pages.
>  2) Assign the compare function (to point to the default compare function)
>     and leave the prefix function to 0 (i.e. no prefix function is used)
>     Get 546 internal pages.

One of our engineers has now reviewed this Support Request, and he replies
as follows:

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
The dataset in question consisted of English words with two-digit
numerical suffixes.  This is exactly the sort of dataset that prefix
compression won't help with at all -- the distinguishing bytes between
many of the keys are the -last- few.  I have a weird suspicion that he
misinterpreted "prefix compression" -- the name is sort of
counterintuitive, as what gets compressed (or rather dropped) is the
-suffix- of a key past the bytes needed to tell it apart from its
neighbor, not the prefix.
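
To make that concrete, here is a minimal Python sketch of what a prefix
function computes: the number of leading bytes of the second key needed
to order it after the first.  It is an illustration only, not Berkeley
DB's source; the function name and the sample keys are made up.

```python
import string

def min_distinguishing_prefix(a: bytes, b: bytes) -> int:
    """Return how many leading bytes of b are needed to order it after a.

    Hypothetical helper for illustration; assumes a sorts before (or equal to) b.
    """
    shorter = min(len(a), len(b))
    for i in range(shorter):
        if a[i] != b[i]:
            # First differing byte found: keep everything up to and including it.
            return i + 1
    # The keys agree over the shorter length; one extra byte of the longer
    # key (if there is one) is enough to sort it after the shorter key.
    return min(shorter + 1, max(len(a), len(b)))

# Word plus two-digit suffix: the difference sits in the last byte,
# so the whole key has to be kept.
short_a, short_b = b"apple07", b"apple08"
needed_short = min_distinguishing_prefix(short_a, short_b)

# Same keys with the alphabet appended: still only 7 bytes are needed,
# and the rest of the key can be dropped.
tail = string.ascii_lowercase.encode()
long_a, long_b = short_a + tail, short_b + tail
needed_long = min_distinguishing_prefix(long_a, long_b)

print(needed_short, len(short_b))
print(needed_long, len(long_b))
```

In the first pair nothing can be dropped; in the second, everything
after the two digits can.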

Anyway, using a similar dataset (the first 20000 words in
/usr/share/dict/words, each inserted 100 times with a different two-digit
suffix, for a total of 2,000,000 keys), I get 26 internal pages with
either the default or with no prefix function; the prefix function doesn't
(and can't) win anything, but it's also no loss.

Plop a few repetitions of the alphabet onto the end of every key, and the
prefix function shines; now, we get 91 internal pages with the function
and a whopping 364 without.  This seems reasonable given the relative
sizes of the truncated and non-truncated keys, and the size differential
between these keys and the ones in the last test.
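
A rough way to see why the gap opens up is to compare full key bytes
against truncated key bytes for neighboring keys, with and without the
appended alphabet.  The Python sketch below does that on a tiny made-up
word list.  It assumes three repetitions of the alphabet and treats
every adjacent pair as if it were a page boundary, so it shows the trend
only and does not reproduce the page counts above.

```python
import string

def separator_len(a: bytes, b: bytes) -> int:
    """Bytes of b needed to order it after a (assumes a sorts before b)."""
    shorter = min(len(a), len(b))
    for i in range(shorter):
        if a[i] != b[i]:
            return i + 1
    return min(shorter + 1, max(len(a), len(b)))

def build_keys(words, tail=""):
    # Each word gets the suffixes 00..99, optionally followed by a long tail.
    return sorted(f"{w}{i:02d}{tail}".encode() for w in words for i in range(100))

def totals(keys):
    # Sum full and truncated sizes of the second key of each adjacent pair.
    pairs = list(zip(keys, keys[1:]))
    full = sum(len(b) for _, b in pairs)
    trunc = sum(separator_len(a, b) for a, b in pairs)
    return full, trunc

words = ["apple", "banana", "cherry"]   # hypothetical stand-in for the dictionary
tail = string.ascii_lowercase * 3       # "a few repetitions" is assumed to be 3

plain_full, plain_trunc = totals(build_keys(words))
long_full, long_trunc = totals(build_keys(words, tail))

print("plain keys:", plain_full, "bytes full,", plain_trunc, "bytes truncated")
print("long keys: ", long_full, "bytes full,", long_trunc, "bytes truncated")
```

The truncated totals come out the same for both key sets, because the
first differing byte always falls within the word and its two digits,
while the full totals grow with the length of the tail.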

In short, as far as I can tell, the prefix function is working exactly as
it ought to.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Thank you for your email, and please let us know if you disagree with our
assessment or if we can be of any further assistance.

Regards,
Amy Adams

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Amy Adams                       Berkeley DB Product Manager
Sleepycat Software Inc.         [EMAIL PROTECTED]
394 E. Riding Dr.               http://www.sleepycat.com
Carlisle, MA 01741-1601
