> It seems to me that if the keys are huffman encoded before they are
> given to the database (both at insertion time and at lookup time) you'd
> get the best of both worlds:  smaller keys and no need to decode when
> comparing keys.  You wouldn't be able to pack some of the data into the
> key, but I am not sure how exactly that would be beneficial anyway.

I agree this is an interesting question.  The problem is that
B+trees are sorted structures, and the data structure is often
chosen because of locality of reference, i.e., if you look up
record "aaaa", it is known that you're likely to look up record
"aaab" too.

The question that I haven't thought through is if it's possible
to Huffman encode data without destroying the sort order, i.e.,
if I have "aaaa", "aaab" and "aaac", does the Huffman encoded
representation sort in the exact same order?  (Remember, the
sort routine is sorting bytes, not bits.)  If the sort order is
changed, we'll have fundamentally changed how Btrees work, and
in a very bad way.

So, the short answer is that I strongly agree with you iff the
Huffman encoding sorts identically to the original key items.
I don't know enough about Huffman engines to know the answer to
that question.

Regards,
--keith

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Keith Bostic
Sleepycat Software Inc.         [EMAIL PROTECTED]
394 E. Riding Dr.               +1-978-287-4781
Carlisle, MA 01741-1601         http://www.sleepycat.com


------------------------------------
To unsubscribe from the htdig3-dev mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the SUBJECT of the message.

Reply via email to