On Thu, Feb 26, 2009 at 3:54 PM, Chris Anderson <[email protected]> wrote:
> On Thu, Feb 26, 2009 at 3:31 PM, Paul Davis <[email protected]> 
> wrote:
>>
>> Not sure i follow what you mean by balance issues. Seeing as its a
>> btree it can't be unbalanced because if it were it wouldn't be a
>> btree. But then our definitions of balance might be different. :D
>
> When a B-tree is consistently inserted into, with a new highest (or
> lowest) value, you can end up with a deeper tree than when it is
> randomly inserted into. Imagine the inner-nodes all accumulating on
> the right side of the tree, so in the worst case you can end up with
> one lonely leaf node to the left of the root, and essentially a linked
> list of inner-nodes on the right side (such that cost would approach
> O(N) for inserts and lookups.

Thank you for clarifying my intent with a much better statement. This
is what I meant by "unbalanced"

>
> My benchmarks show that inserts do not slow down as you would expect
> them to in this worst case, so Couch is probably doing something
> non-naive to rebalance the tree even when inserts are always at one
> extreme.

Yes, Pauls response shows that this is the case.

>
>>
>> The only performance difference between reading random and the
>> sequential document ids I can think of would favor the sequential side
>> when you're reading documents in roughly the same order as their ids
>> so we get FS cache hits.
>
> From a practical standpoint, you're probably right about this
> (assuming the tree remains shallow even with inserts consistently at
> one edge of the tree.)
>
> Chris
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>

Reply via email to