> It's much simpler than that: 8k pages in memory, compressed to
> 4k pages on disk. To find a page you seek to page-number * 4k. No holes.
> The idea is to do the following when writing an 8k page:
>
> Write operation
>
> . compress the page
> . if the compressed size <= 4k,
>   write it at page-number * 4k
> . if the compressed size > 4k,
>   allocate a new page,
>   store a reference to the new page,
>   write the first uncompressed 4k at page-number * 4k,
>   write the last uncompressed 4k at new-page-number * 4k
Yes, I guess that works... you might as well use 2K pages on disk
so that you get a finer granularity from the compression; I
don't think it makes the problem harder.
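Roughly, the write path you describe would look something like the
sketch below -- only a sketch, assuming 8K in-memory pages and 4K
on-disk slots; compress2() is the real zlib call, but write_page(),
alloc_overflow_page() and the raw-file layout are made-up names and
assumptions for illustration, not Db internals:

    /*
     * Minimal sketch of the write path described above: compress an 8K
     * in-memory page into a 4K on-disk slot, spilling to a second slot
     * (uncompressed) when the page doesn't compress enough.
     */
    #include <string.h>
    #include <unistd.h>
    #include <zlib.h>

    #define MEM_PGSIZE	8192
    #define DISK_PGSIZE	4096

    extern unsigned long alloc_overflow_page(void); /* hypothetical free-list allocator */

    int
    write_page(int fd, unsigned long pgno, const unsigned char *page)
    {
    	unsigned char buf[MEM_PGSIZE + 64];	/* room for zlib's worst case */
    	uLongf clen = sizeof(buf);
    	unsigned long ovfl;

    	if (compress2(buf, &clen, page, MEM_PGSIZE, Z_DEFAULT_COMPRESSION) != Z_OK)
    		return (-1);

    	if (clen <= DISK_PGSIZE) {		/* common case: one 4K slot */
    		memset(buf + clen, 0, DISK_PGSIZE - clen);
    		return (pwrite(fd, buf, DISK_PGSIZE,
    		    (off_t)pgno * DISK_PGSIZE) == DISK_PGSIZE ? 0 : -1);
    	}

    	/* Exception case: store the page uncompressed across two slots. */
    	ovfl = alloc_overflow_page();
    	if (pwrite(fd, page, DISK_PGSIZE,
    	    (off_t)pgno * DISK_PGSIZE) != DISK_PGSIZE)
    		return (-1);
    	if (pwrite(fd, page + DISK_PGSIZE, DISK_PGSIZE,
    	    (off_t)ovfl * DISK_PGSIZE) != DISK_PGSIZE)
    		return (-1);
    	/* The reference to ovfl still has to be recorded somewhere. */
    	return (0);
    }

The read path would also need to know whether a slot holds compressed
or raw data and where the compressed stream ends -- some kind of small
per-slot header -- which is really the same "stealing bits" question
you raise further down.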
You will have to maintain a list of free pages somewhere (when
an in-memory page in the middle of the database changes and moves
because it needs more on-disk pages to be allocated, you'll end up
with "free" pages in the middle of the on-disk file).
> I did some tests yesterday. I took a 900MB db file containing
> a btree that contains entries that look like what we are going to use.
> I took each 4k page and compressed it individually using zlib. The
> result is that only 171 pages out of 230 000 compressed to a size that
> is greater than 2k. This is less than 0.1%.
> In order to make my test valid, I used a btree built from randomly
> inserted keys (the one that compresses to only 55% with gzip) instead
> of the result of a db_dump|db_load (which compresses to 80% with gzip).
It's unclear to me that that's the worst case -- the unused space on
the pages is probably all 0's, which means it will compress
well. If I had to think of a "worst case", I'd try storing a
set of already zlib-compressed binaries in a Btree, with their
paths as the keys.
Regardless, those numbers sound good. What was the range of
compression ratios? How many pages compressed down to 1K?
I'd also like to see numbers for 8K pages -- 8K is the "standard"
out there, and you'll get better compression from the larger size.
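A bucketed version of the measurement would answer both questions;
something along these lines (assuming, for the sake of the example,
that the file is read as a plain sequence of raw 4K pages):

    #include <stdio.h>
    #include <zlib.h>

    #define PGSIZE	4096

    int
    main(int argc, char *argv[])
    {
    	unsigned char page[PGSIZE], out[PGSIZE * 2];
    	unsigned long le_1k = 0, le_2k = 0, le_4k = 0, gt_4k = 0;
    	FILE *fp;

    	if (argc != 2 || (fp = fopen(argv[1], "rb")) == NULL)
    		return (1);
    	while (fread(page, 1, PGSIZE, fp) == PGSIZE) {
    		uLongf clen = sizeof(out);

    		if (compress2(out, &clen, page, PGSIZE,
    		    Z_DEFAULT_COMPRESSION) != Z_OK)
    			continue;
    		if (clen <= 1024)
    			++le_1k;
    		else if (clen <= 2048)
    			++le_2k;
    		else if (clen <= 4096)
    			++le_4k;
    		else
    			++gt_4k;
    	}
    	fclose(fp);
    	printf("<=1K %lu  <=2K %lu  <=4K %lu  >4K %lu\n",
    	    le_1k, le_2k, le_4k, gt_4k);
    	return (0);
    }

Rerunning it with PGSIZE set to 8192 would give the 8K numbers.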
> Therefore I'm in the worst case, and the number of blocks that do
> not compress enough and lead to the exception case is small. The
> exception handling can therefore afford to be complex and time-consuming.
> I'll try an implementation today, and try to find a way not to steal
> bits from the page structure. It would be annoying to store a page
> reference in every page when only 0.1% of them will use it.
It might be worth thinking about the lifetime of the information
you need. You might be able to discard it once the page is in memory,
or, at least, hold it in a separate area.
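For instance (purely illustrative, not Db internals), the overflow
references could live in a small side table loaded at open time, so
nothing is stolen from the page structure itself:

    #include <stddef.h>

    struct ovfl_ref {
    	unsigned long pgno;	/* logical page number */
    	unsigned long ovfl;	/* on-disk slot holding its second 4K half */
    };

    struct ovfl_map {
    	struct ovfl_ref *ref;	/* expected to hold < 0.1% of the pages */
    	size_t n;
    };

    /*
     * Return the overflow slot for pgno, or 0 if the page fit compressed
     * (slot 0 is never handed out as an overflow slot in this sketch).
     */
    unsigned long
    ovfl_lookup(const struct ovfl_map *map, unsigned long pgno)
    {
    	size_t i;

    	for (i = 0; i < map->n; i++)
    		if (map->ref[i].pgno == pgno)
    			return (map->ref[i].ovfl);
    	return (0);
    }

Since so few pages ever need an entry, even a linear scan is cheap,
and the entry only matters when reading the page in from disk.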
We need to start thinking about recovery, too. If the system
crashes when you've only written one on-disk page of a two
on-disk page pair, how do you do recovery to guarantee that no
data is ever lost?
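One classic way to at least detect the torn pair -- offered only as an
illustration of the problem, not as a decision -- is to stamp both
on-disk halves with the same per-write sequence number, so recovery can
tell whether the second slot belongs to an older version of the page.
Detection alone doesn't guarantee no data is lost; you'd still need an
ordering or logging rule for the two writes.

    struct disk_slot_hdr {
    	unsigned long pgno;	/* logical page this slot belongs to */
    	unsigned long seqno;	/* incremented on every write of the page */
    };

    /* At recovery time, the two-slot pair is usable only if both halves match. */
    int
    pair_is_consistent(const struct disk_slot_hdr *first,
        const struct disk_slot_hdr *second)
    {
    	return (first->pgno == second->pgno && first->seqno == second->seqno);
    }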
Regards,
--keith
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Keith Bostic
Sleepycat Software Inc. [EMAIL PROTECTED]
394 E. Riding Dr. +1-978-287-4781
Carlisle, MA 01741-1601 http://www.sleepycat.com