On Jul 26, 2010, at 2:35 PM, Mikeal Rogers wrote:

> After some conversations I've had in NYC this week and Mathias' great post
> on the 10 biggest issues with CouchDB (
> http://www.paperplanes.de/2010/7/26/10_annoying_things_about_couchdb.html )
> I wanted to formally propose some changes to the view server/protocol.
>
> The first issue I want to tackle is the lack of CommonJS modules in
> map/reduce. The reason for this is that we use a deterministic hash of all
> the views in a design document in order to query it.
>
> First off, it would be great if we could separate out each view and cache
> it based on its own hash. This way updating one view doesn't blow away the
> entire design document. This has some large ramifications; for one thing,
> it means that each view needs to keep its own last sequence, and while one
> view is getting up to date it can't be included in generation when other
> views are getting updated.
-1 on splitting views into multiple indexes within a single ddoc. The performance gains of batching are too great to ignore. If you want to split views, put them in their own ddoc.

> Once each view has its own deterministic hash, I would propose that we
> move the responsibility for generating the hash to a new view server call.
> This call would get triggered during every design doc update and look
> something like:
>
> request : ["hash", {"_id":"_design/foo", .......} ]
> response : ["views/bar","aoivniuasdf8ashd7zh87vxxz87gf8sd7"]
>
> The view server can inspect each map/reduce function and determine which
> modules it imports and include those strings in the hash for that
> particular view.

That is fine by me (even great), but we will need to hash those hashes together on CouchDB's side to reflect the fact that all of a design doc's views are stored in a single index file.

> The second issue I'd like to tackle is twofold: parallelized view
> generation and unnecessarily chatty IO for large view generations.
>
> Currently, every single document is passed to the view server one at a
> time and the response is read back one at a time. I would suggest that we
> allow a user-configurable upper limit to "batch" documents to the view
> server (100 by default). The request/response would remain exactly the
> same as it is now except there would be an extra array around the request
> and response.
>
> This would also open up the ability for the view server to break up that
> batch and pass it to different view servers and then return the responses
> all together (this obviously means it's limited to the speed of the client
> handling the last chunk).
>
> Thoughts?

This is good; we should do it. We should maybe spend a little more time in the design phase thinking about how to similarly fix the lingering bugs in the _list code, make externals non-blocking, etc.
Chris

> Somewhere on github I actually have the changes to the view server for
> that batching, but it doesn't include the changes on the erlang side.
>
> -Mikeal
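The batching change Mikeal describes can be sketched as below. This is a toy model under stated assumptions: `map_docs`, `mapDoc`, and `handleBatch` are invented names for illustration, not the real view server protocol; the only parts taken from the proposal are the extra array around request and response and the default batch size of 100.

```javascript
// Hypothetical sketch: instead of one round trip per document, documents
// are grouped into batches and sent with an extra array around them; the
// response gains a matching outer array.

// Stand-in for the view server's map step: one result array per map
// function; here a single map that emits [doc._id, null].
function mapDoc(doc) {
  return [[[doc._id, null]]];
}

// One batched request, shaped like ["map_docs", [doc, doc, ...]].
function handleBatch(request) {
  const [, docs] = request;
  return docs.map(mapDoc); // extra array wrapping the per-doc responses
}

// Split the changed docs into batches of `size` (100 by default in the
// proposal).
function batches(docs, size) {
  const out = [];
  for (let i = 0; i < docs.length; i += size) {
    out.push(docs.slice(i, i + size));
  }
  return out;
}

const docs = Array.from({ length: 250 }, (_, i) => ({ _id: 'doc-' + i }));
const responses = batches(docs, 100).map(b => handleBatch(['map_docs', b]));
// 250 docs at batch size 100 → 3 round trips instead of 250.
```

Because each batch is self-contained, a view server could also fan a batch out to worker processes and reassemble the responses, which is the parallelization Mikeal mentions.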