On Sun, 25 Jul 1999, Sleepycat Software wrote:
> The trick is to insert sorted data. If you split a full Btree
> page, half of the original page is put on one page and half on
> another, giving you a 50% page fill factor. If you insert
> sorted data, the Berkeley DB implementation splits in a special
> way, putting almost all the old data on one page and only a
> single key on the new page. From the Berkeley DB documentation:
I must be missing something here... If you're doing dynamic inserts, it
seems hard to insert them in sorted order. :-)
Fortunately for us, we're inserting in batches (one batch per new document). So
you're saying the page-fill factor will be much better if each batch is
inserted in sorted order by key. I'm pretty sure that's what you mean, but
I'd like to be clear.
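For concreteness, here's the sort of thing I'm picturing for our per-document
batches. This is only an untested sketch against the Berkeley DB 3.x C API;
the function name, the word list, and the db.words.db filename are
placeholders, not what ht://Dig actually does:

/* Untested sketch, assuming the Berkeley DB 3.x C API (db_create/DB->open).
 * The function name, word list, and "db.words.db" filename are placeholders. */
#include <db.h>
#include <stdlib.h>
#include <string.h>

static int cmp_keys(const void *a, const void *b)
{
    return strcmp(*(const char * const *)a, *(const char * const *)b);
}

int add_document_words(char **words, size_t nwords)
{
    DB *dbp;
    DBT key, data;
    size_t i;
    int ret;

    /* Sort the batch first so the Btree sees the keys in order and can
     * use its "sorted insert" split, leaving pages nearly full. */
    qsort(words, nwords, sizeof(char *), cmp_keys);

    if ((ret = db_create(&dbp, NULL, 0)) != 0)
        return ret;
    if ((ret = dbp->open(dbp, "db.words.db", NULL,
                         DB_BTREE, DB_CREATE, 0644)) != 0) {
        dbp->close(dbp, 0);
        return ret;
    }

    for (i = 0; i < nwords; i++) {
        memset(&key, 0, sizeof(key));
        memset(&data, 0, sizeof(data));
        key.data = words[i];
        key.size = strlen(words[i]) + 1;
        data.data = words[i];   /* stand-in for the real per-word record */
        data.size = strlen(words[i]) + 1;
        if ((ret = dbp->put(dbp, NULL, &key, &data, 0)) != 0)
            break;
    }
    dbp->close(dbp, 0);
    return ret;
}

The point is just the qsort() before the puts; everything else is the usual
put loop.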
> There are a lot of other algorithms to keep the page-fill factor
> in the Btrees higher, mostly involving page-balancing when keys
...
> logically some number of unrelated leaf pages during an operation.
This makes sense. However, from what you've said, running a database through
db_dump and db_load would effectively rebuild the tree with well-filled pages,
since db_dump walks the Btree in key order and db_load therefore reinserts
everything sorted. It's just that we can't do that easily while the indexes
are being updated dynamically. (It's obviously a case of not being able to
have your cake and eat it too.)
I think the technique of db_dump/db_load should satisfy those looking to
"optimize" the databases before use.
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/