On Feb 26, 2009, at 1:55 PM, Jan Lehnardt wrote:


On 26 Feb 2009, at 19:49, Barry Wark wrote:

or ascending...

As an aside, why is it that sequential document ids would produce a
significant performance boost? I suspect the answer is something
rather fundamental to CouchDB's design, and I'd like to try to grok
it.

B-tree inner nodes can get cached better if inserts basically always
use the same path.


What he said. It's pretty standard btree stuff; most, if not all, of the major RDBMSs have similar issues with primary keys.

Also, the ids don't need to be sequential (1,2,3,4...), just ordered (1,5,19,22...). And they don't need to sort higher or lower than all the other ids, so long as they are clustered together. Each btree node that has to be loaded and isn't in cache is expensive. The more the keys have to be inserted into random places in the btree, the worse the caching behavior. Right now, with the crypto random UUIDs we generate, it's basically the worst-case scenario for doc inserts.
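
To make the caching argument concrete, here is a rough sketch in Python. It is a toy model, not CouchDB's actual btree code: it assumes a fixed-shape tree and a small LRU node cache, and the names FANOUT, DEPTH, CACHE_NODES, path_for and count_misses are all made up for illustration. It counts how many nodes have to be loaded when the same ids are inserted in order versus in random order.

```python
import random
from collections import OrderedDict

# All three values are assumptions chosen for illustration only.
FANOUT = 16        # children per node
DEPTH = 4          # levels from root to leaf
CACHE_NODES = 64   # nodes the LRU cache can hold


def path_for(key):
    """Return the node ids (root to leaf) touched when inserting key."""
    # A node is identified by its level and the key prefix it covers.
    return [(level, key // FANOUT ** (DEPTH - level)) for level in range(DEPTH)]


def count_misses(keys):
    """Count node loads that miss a fixed-size LRU cache."""
    cache = OrderedDict()
    misses = 0
    for key in keys:
        for node in path_for(key):
            if node in cache:
                cache.move_to_end(node)        # mark as recently used
            else:
                misses += 1                    # node must be loaded
                cache[node] = True
                if len(cache) > CACHE_NODES:
                    cache.popitem(last=False)  # evict least recently used
    return misses


rng = random.Random(0)
# Ordered but not sequential ids, like 1, 5, 19, 22...
ordered = sorted(rng.sample(range(FANOUT ** DEPTH), 5000))
shuffled = ordered[:]
rng.shuffle(shuffled)  # same ids, random insertion order

print("ordered inserts, cache misses: ", count_misses(ordered))
print("shuffled inserts, cache misses:", count_misses(shuffled))
```

In this model the ordered run loads each node at most once, because consecutive inserts share most of their path from the root. The shuffled run keeps evicting nodes it will need again later, which is roughly the situation with random UUIDs.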

-Damien
