On Thu, Feb 26, 2009 at 3:54 PM, Chris Anderson <[email protected]> wrote: > On Thu, Feb 26, 2009 at 3:31 PM, Paul Davis <[email protected]> > wrote: >> >> Not sure i follow what you mean by balance issues. Seeing as its a >> btree it can't be unbalanced because if it were it wouldn't be a >> btree. But then our definitions of balance might be different. :D > > When a B-tree is consistently inserted into, with a new highest (or > lowest) value, you can end up with a deeper tree than when it is > randomly inserted into. Imagine the inner-nodes all accumulating on > the right side of the tree, so in the worst case you can end up with > one lonely leaf node to the left of the root, and essentially a linked > list of inner-nodes on the right side (such that cost would approach > O(N) for inserts and lookups.
Thank you for clarifying my intent with a much better statement. This is what I meant by "unbalanced" > > My benchmarks show that inserts do not slow down as you would expect > them to in this worst case, so Couch is probably doing something > non-naive to rebalance the tree even when inserts are always at one > extreme. Yes, Pauls response shows that this is the case. > >> >> The only performance difference between reading random and the >> sequential document ids I can think of would favor the sequential side >> when you're reading documents in roughly the same order as their ids >> so we get FS cache hits. > > From a practical standpoint, you're probably right about this > (assuming the tree remains shallow even with inserts consistently at > one edge of the tree.) > > Chris > > -- > Chris Anderson > http://jchris.mfdz.com >
