On Jul 26, 2010, at 2:35 PM, Mikeal Rogers wrote:

> After some conversations I've had in NYC this week and Mathias' great post
> on the 10 biggest issues with CouchDB (
> http://www.paperplanes.de/2010/7/26/10_annoying_things_about_couchdb.html )
> I wanted to formally propose some changes to the view server/protocol.
> 
> The first issue I want to tackle is the lack of CommonJS modules in
> map/reduce. The reason for this is that we use a deterministic hash on all
> the views in a design document in order to query it.
> 
> First off, it would be great if we could separate out each view and cache it
> based on its own hash. This way updating one view doesn't blow away the
> entire design document. This has some large ramifications; for one thing, it
> means that each view needs to keep its own last sequence, and while one view
> is catching up it can't be included in generation alongside the other views
> that are being updated.

-1 on splitting views into multiple indexes within a single ddoc. the 
performance gains of batching are too great to ignore.

if you want to split views, put them in their own ddoc.
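
to make that concrete, a sketch (names made up) of the split as it works today: 
instead of one _design/foo holding views a and b, give each view its own design 
doc so each gets its own index file and its own update sequence:

  {"_id": "_design/foo_a",
   "views": {"a": {"map": "function(doc) { emit(doc.type, 1); }"}}}

  {"_id": "_design/foo_b",
   "views": {"b": {"map": "function(doc) { emit(doc.owner, 1); }"}}}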

> 
> Once each view has its own deterministic hash, I would propose that we move
> the responsibility for generating the hash to a new view server call. This
> call would get triggered during every design doc update and look something
> like:
> 
> request : ["hash", {"_id":"_design/foo", .......} ]
> response ["views/bar","aoivniuasdf8ashd7zh87vxxz87gf8sd7"]
> 
> The view server can inspect each map/reduce function and determine which
> modules it imports and include those strings in the hash for that particular
> view.

that is fine by me (even great) but we will need to hash those hashes together 
on couch's side to reflect the fact that all of a design doc's views are stored 
in a single index file.
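
something like this (a rough sketch, untested, all names here are made up, and 
the require() scraping / md5 are just placeholders for whatever we actually 
pick) for the view server side, with the last function showing the kind of 
combination couch would then do in Erlang:

  var crypto = require('crypto');

  function md5(s) {
    return crypto.createHash('md5').update(s).digest('hex');
  }

  // hash one view: its map/reduce source plus the source of every CommonJS
  // module those functions pull in, so editing a module re-hashes the views
  // that use it and nothing else
  function hashView(ddoc, viewName) {
    var view = ddoc.views[viewName];
    var funs = (view.map || '') + (view.reduce || '');
    var src = funs;
    var re = /require\(['"]([^'"]+)['"]\)/g, m;
    while ((m = re.exec(funs)) !== null) {
      // resolve a path like "views/lib/helpers" inside the ddoc
      var mod = ddoc;
      m[1].split('/').forEach(function (p) { mod = mod && mod[p]; });
      if (typeof mod === 'string') src += mod;
    }
    return md5(src);
  }

  // couch-side combination (would really live in Erlang): fold the per-view
  // hashes into one signature for the shared index file
  function ddocSignature(viewHashes) {
    return md5(viewHashes.sort().join(','));
  }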

> 
> The second issue I'd like to tackle is twofold: parallelized view
> generation and unnecessarily chatty IO for large view generations.
> 
> Currently, every single document is passed to the view server one at a time
> and the response is read back one at a time. I would suggest that we allow a
> user-configurable upper limit for "batching" documents to the view server
> (100 by default). The request/response would remain exactly the same as it is
> now, except there would be an extra array around the request and response.
> 
> This would also open up the ability for the view server to break up that
> batch and pass it to different view servers and then return the responses
> all together (this obviously means it's limited to the speed of the client
> handling that last chunk).
> 
> Thoughts?

this is good, we should do it. we should maybe spend a little more time in the 
design phase thinking about how to similarly fix the lingering bugs in the _list 
code, make externals non-blocking, etc.
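
for the batching, the wire format isn't nailed down yet, so just to illustrate 
the "extra array" idea (the "map_docs" label and the loop below are assumptions, 
not a spec):

  request  : ["map_docs", [doc1, doc2, ..., doc100]]
  response : [results_for_doc1, results_for_doc2, ..., results_for_doc100]

  // view-server side, a batch handler can just loop today's per-doc path
  function mapDocs(docs, mapFuns) {
    return docs.map(function (doc) {
      return mapFuns.map(function (fun) {
        var emitted = [];
        var emit = function (key, value) { emitted.push([key, value]); };
        fun(doc, emit); // sketch only: real couchjs exposes emit() as a global
        return emitted;
      });
    });
  }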

Chris

> 
> Somewhere on GitHub I actually have the changes to the view server for that
> batching, but it doesn't include the changes on the Erlang side.
> 
> -Mikeal
