On Jan 20, 2014, at 3:36 PM, Jens Alfke <j...@couchbase.com> wrote:

> 
> On Jan 20, 2014, at 12:19 PM, Stefan Klein <st.fankl...@gmail.com> wrote:
> 
>> a performance impact of random document ids.
>> If the document ids are not sequential, larger portions of the b-tree need
>> to be rewritten.
>> Is this related only to inserts or also to updates?
> 
> It only applies to inserts, because if nodes are added to the b-tree in 
> random order, more rebalancing will be necessary. Adding them in sequential 
> order is more optimal.
> 
> Updates don't change the structure of the tree (only the contents of leaf 
> nodes) so their ordering doesn't matter as much.
> 
> —Jens

Well, at the end of the day the goal is for documents that are mutated 
concurrently to share long common ID prefixes. If they do, they'll share 
many of the same inner nodes on their respective paths to the root, and we can 
optimize away the extra rewrites of those inner nodes.
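
To make the shared-path argument concrete, here is a toy model in Python. It is
only a sketch under simplifying assumptions, not how the storage engine is
actually implemented: ids are packed in sorted order into leaves of a fixed
size, and every level above groups the same number of children into one inner
node. The function name, the fanout of 4, and the sample ids are all made up
for illustration.

```python
def nodes_rewritten(sorted_ids, updated_ids, fanout=4):
    """Count distinct nodes on the paths from the updated leaves to the root.

    Toy model (an assumption, not the real on-disk layout): ids are packed
    in sorted order into leaves of `fanout` entries, and each level above
    groups `fanout` children into one inner node.
    """
    position = {doc_id: i for i, doc_id in enumerate(sorted_ids)}
    touched = {position[d] // fanout for d in updated_ids}  # leaf indexes
    width = -(-len(sorted_ids) // fanout)  # number of leaves (ceiling division)
    total = len(touched)
    while width > 1:
        # Move one level up: map each touched node to its parent.
        touched = {i // fanout for i in touched}
        width = -(-width // fanout)
        total += len(touched)
    return total


ids = sorted(f"doc-{i:04d}" for i in range(256))
hot = ids[:8]          # eight ids that are adjacent in sort order
scattered = ids[::32]  # eight ids spread evenly across the keyspace

print(nodes_rewritten(ids, hot))        # 5
print(nodes_rewritten(ids, scattered))  # 21
```

In this model a batch of eight neighbouring ids rewrites 5 nodes, while eight
scattered ids rewrite 21, because the scattered paths only merge near the root.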

The easiest place to achieve this is during insertion, by a judicious choice of 
document ID. But if for some reason you have a subset of documents in your 
database that are "hot" (i.e., frequently updated relative to the others), and 
you can afford to update them via _bulk_docs, then it would make sense to give 
that document class a common ID prefix so that you can benefit from this group 
commit optimization.
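
As a rough illustration of that last suggestion, the sketch below builds a
request body for POST /{db}/_bulk_docs in which every hot document gets the
same ID prefix. The "hot:" prefix, the helper names, the document fields, and
the revision string are hypothetical placeholders; only the {"docs": [...]}
envelope and the _id/_rev fields come from the _bulk_docs API itself. No
request is sent here.

```python
import json

HOT_PREFIX = "hot:"  # hypothetical prefix shared by the frequently updated docs


def hot_id(name):
    """Build a document id that sorts next to the other hot documents."""
    return HOT_PREFIX + name


def bulk_docs_body(updates):
    """Return a JSON body for _bulk_docs.

    `updates` is an iterable of (name, rev, fields) tuples; pass rev=None
    for a document that does not exist yet.
    """
    docs = []
    for name, rev, fields in updates:
        doc = dict(fields, _id=hot_id(name))
        if rev is not None:
            doc["_rev"] = rev  # an update must carry the current revision
        docs.append(doc)
    # Sorting is only for readable output in this sketch.
    docs.sort(key=lambda d: d["_id"])
    return json.dumps({"docs": docs})


body = bulk_docs_body([
    ("counter-b", "3-placeholder", {"n": 7}),  # placeholder revision string
    ("counter-a", None, {"n": 1}),
])
print(body)
```

Sending all of the hot documents in one request like this is what gives the
server a chance to apply them as a group.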

Adam
