Performance Regression for view generation in 0.11
--------------------------------------------------

                 Key: COUCHDB-700
                 URL: https://issues.apache.org/jira/browse/COUCHDB-700
             Project: CouchDB
          Issue Type: Improvement
          Components: Database Core
    Affects Versions: 0.10
         Environment: Ubuntu 9.04: Stock CouchDB 0.10 / Own build of CouchDB
0.11 branch.
            Reporter: Henrik Thostrup Jensen
         Attachments: couchdb_011_view_speedup.diff

Copied from a mail sent to dev on March 15, 2010:

I have a synthetic benchmark for view generation over 70K documents. In stock
CouchDB 0.10, the view is checkpointed about 15-17 times, or around 9 times
with batch_save_size and batch_save_interval set to 10000. CouchDB 0.11, on
the other hand, performs a whopping 108-109 checkpoints of the view. Due to
shadow B-trees, this generates significantly larger view files (2-3x as large)
and more time is spent writing to disk. Generating the view takes roughly
twice as long in 0.11 as it does in 0.10.

I've tracked the problem down to the new view generation architecture, in
particular the small sizes of the work queues defined in
couch_view_updater.erl. The attached patch basically increases the size of the
write queue; this decreases the number of checkpoints to around 15 and makes
view generation slightly faster than in 0.10. Inserting a 500 ms sleep in
do_writes increased performance a bit more, but that is neither a nice nor a
proper solution.
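
To illustrate why a larger write queue should mean fewer checkpoints, here is
a toy Python model. It is not CouchDB code. It assumes the mapper always
outpaces the writer, so the writer finds the queue full on every drain and
performs exactly one checkpoint per drained batch. The function name and both
queue capacities are hypothetical and were picked only for illustration; they
are not the values in couch_view_updater.erl or in the attached patch.

```python
def count_checkpoints(total_docs: int, queue_capacity: int) -> int:
    """Count writer checkpoints in a simplified bounded-queue model.

    Assumption: the producer keeps the queue full, and each drain of the
    queue by the writer ends in exactly one checkpoint.
    """
    checkpoints = 0
    remaining = total_docs
    while remaining > 0:
        batch = min(queue_capacity, remaining)  # writer drains a full queue
        remaining -= batch
        checkpoints += 1  # one checkpoint per drained batch
    return checkpoints


TOTAL_DOCS = 70_000  # document count from the benchmark described above

# Hypothetical capacities: one small queue, one enlarged queue.
for capacity in (650, 5_000):
    print(capacity, count_checkpoints(TOTAL_DOCS, capacity))
```

In this model the checkpoint count is simply the document count divided by the
queue capacity, rounded up, so the count falls in direct proportion as the
queue grows. The real updater is also bounded by queue size in bytes and by
timing, which this sketch ignores.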

I suspect the patch is not "the completely right solution (tm)", as a lot of
checkpoints are performed initially, after which it backs off and takes more
time/revisions between checkpoints. I suspect that the code is just writing
repeatedly, and as writes start to take longer, more revisions are added per
checkpoint, though I am not really sure of this.
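
The back-off guess above can be sketched as a second toy Python model. Again
this is not CouchDB code and is not based on measurements. It assumes that the
writer checkpoints whatever has queued up as soon as it is free, that documents
arrive at a constant rate, and that each write gets slower as more documents
are already stored. All names and parameter values are hypothetical.

```python
def checkpoint_batches(total_docs, docs_per_ms, base_write_ms, ms_per_stored_doc):
    """Return the number of docs written at each checkpoint.

    Assumptions (all hypothetical): the writer is never idle, docs arrive at
    a constant rate, and write time grows linearly with docs already stored.
    """
    if total_docs <= 0:
        return []
    batches = []
    stored = 0          # docs already written to the view file
    pending = 1         # the writer starts as soon as the first doc arrives
    queued_total = 1    # docs handed to the writer so far
    while pending > 0:
        # Cost of this write depends on how much is already on disk.
        write_ms = base_write_ms + ms_per_stored_doc * stored
        stored += pending
        batches.append(pending)
        # Docs that arrived while the writer was busy form the next batch.
        arrived = max(1, int(write_ms * docs_per_ms))
        pending = min(arrived, total_docs - queued_total)
        queued_total += pending
    return batches


batches = checkpoint_batches(70_000, docs_per_ms=1.0,
                             base_write_ms=50.0, ms_per_stored_doc=0.05)
print(len(batches), "checkpoints; first batches:", batches[:5])
```

Under these assumptions the early checkpoints are small and frequent, and
later ones carry more documents each, which matches the behaviour described
above. Whether this is what actually happens in the updater is not confirmed.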

Still, it is a 2-line patch, and it significantly increases view generation
performance. I'd very much like to see this in 0.11, to avoid a rather large
performance regression between 0.10 and 0.11. If 0.11 comes out as it is, we
would either have to stick with 0.10 or build our own patched 0.11.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
