On Feb 26, 2009, at 1:55 PM, Jan Lehnardt wrote:
On 26 Feb 2009, at 19:49, Barry Wark wrote:
or ascending...
As an asside, why is it that sequential document ids would produce a
significant performance boost? I suspect the answer is something
rather fundamental to CouchDB's design, and I'd like to try to grok
it.
b-trees inner-nodes can get cached better if inserts basically always
use the same path.
What he said. It's pretty standard btree stuff, most, if not all the
major rdbms have similar issues with primary keys.
Also, he Ids don't need to be sequential (1,2,3,4...), just ordered
(1,5,19,22...). And they don't need to sort higher or lower than all
the other ids, so long as they are clustered together. The each btree
nodes that have to be loaded that isn't in cache is expensive. The
more the keys have to be inserted into random places in the btree, the
worse the caching behavior. Right now, with the crypto random UUIDs we
generate, it's basically the worst case scenario for doc inserts.
-Damien