Proposal for changes in view server/protocol

Mikeal Rogers Mon, 26 Jul 2010 14:35:36 -0700

After some conversations I've had in NYC this week and Mathias' great post
on the 10 biggest issues with CouchDB (
http://www.paperplanes.de/2010/7/26/10_annoying_things_about_couchdb.html )
I wanted to formally propose some changes to the view server/protocol.


The first issue I want to tackle is the lack of CommonJS modules in
map/reduce. The reason for this is that we compute a deterministic hash over
all the views in a design document to identify its index, and code pulled in
from a module would not be reflected in that hash.

First off, it would be great if we could separate out each view and cache it
based on its own hash. This way updating one view doesn't blow away the
entire design document. This has some large ramifications: for one thing, it
means that each view needs to keep its own last sequence, and while one view
is catching up it can't be included in the generation pass when other views
are getting updated.

Once each view has its own deterministic hash I would propose that we move
the responsibility for generating the hash to a new view server call. This
call would get triggered during every design doc update and look something
like:

request:  ["hash", {"_id":"_design/foo", .......} ]
response: ["views/bar","aoivniuasdf8ashd7zh87vxxz87gf8sd7"]

The view server can inspect each map/reduce function and determine which
modules it imports and include those strings in the hash for that particular
view.
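
A rough sketch of what such a per-view hash could look like, written in
Python purely for illustration. Everything here is an assumption rather than
an existing API: the names view_hash, hash_views and resolve_module are
hypothetical, modules are assumed to live in the design document and to be
addressed by slash-separated paths, imports are detected with a simple
regular expression on require("...") calls, and SHA-1 is an arbitrary choice
of digest.

```python
import hashlib
import re

# Naive detection of require("path") / require('path') calls (assumption).
REQUIRE_RE = re.compile(r"""require\(\s*['"]([^'"]+)['"]\s*\)""")


def resolve_module(ddoc, path):
    """Walk the design doc along a slash-separated path; None if not found."""
    node = ddoc
    for part in path.split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node if isinstance(node, str) else None


def view_hash(ddoc, view):
    """Hash one view's functions plus the source of every module they import."""
    digest = hashlib.sha1()
    pending = [view.get("map", ""), view.get("reduce", "")]
    seen = set()
    while pending:
        source = pending.pop()
        digest.update(source.encode("utf-8") + b"\0")
        for path in sorted(REQUIRE_RE.findall(source)):
            if path in seen:
                continue
            seen.add(path)
            module = resolve_module(ddoc, path)
            if module is not None:
                pending.append(module)  # follow imports made by the module too
    return digest.hexdigest()


def hash_views(ddoc):
    """Return one ["views/<name>", <hash>] pair per view in the design doc."""
    views = ddoc.get("views", {})
    return [["views/" + name, view_hash(ddoc, views[name])] for name in sorted(views)]
```

With something like this, editing a module only changes the hash of the views
that import it, so the other views in the design document keep their indexes.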

The second issue I'd like to tackle is twofold: parallelized view
generation and unnecessarily chatty IO for large view generations.

Currently, every single document is passed to the view server one at a time
and the response is read back one at a time. I would suggest that we allow a
user-configurable upper limit on how many documents are "batched" to the view
server (100 by default). The request/response would remain exactly the same
as it is now except there would be an extra array around the request and
response.
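
To make the "extra array" idea concrete, here is a minimal sketch in Python.
It assumes the existing per-document request has the shape ["map_doc", doc];
the function names batch_requests and handle_batch, and the handle_one
callback that answers a single request, are hypothetical and only there for
illustration.

```python
import json


def batch_requests(docs, limit=100):
    """Yield one JSON line per batch: an array of ordinary map_doc requests."""
    for start in range(0, len(docs), limit):
        chunk = docs[start:start + limit]
        yield json.dumps([["map_doc", doc] for doc in chunk])


def handle_batch(line, handle_one):
    """View server side: answer every request in the batch, in order."""
    requests = json.loads(line)
    return json.dumps([handle_one(request) for request in requests])
```

Each element of the response array is exactly what the view server would have
returned for that document on its own, so the per-document handling does not
need to change.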

This would also open up the ability for the view server to break up that
batch and pass it to different view servers and then return the responses
all together (this obviously means it's limited to the speed of the client
handling that last chunk).
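
A sketch of that fan-out, again in Python and under stated assumptions: each
worker is modelled as a plain callable that takes a list of requests and
returns a list of responses (in practice it would be a separate view server
process), and the names split and fan_out are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(batch, parts):
    """Cut a batch into at most `parts` contiguous chunks of similar size."""
    size = max(1, -(-len(batch) // parts))  # ceiling division, never zero
    return [batch[i:i + size] for i in range(0, len(batch), size)]


def fan_out(batch, workers):
    """Send one chunk to each worker and return the responses in request order."""
    chunks = split(batch, len(workers))
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(worker, chunk)
                   for worker, chunk in zip(workers, chunks)]
        results = []
        for future in futures:
            # result() blocks, so the call as a whole waits for the slowest chunk.
            results.extend(future.result())
    return results
```

Collecting the futures in submission order keeps the combined response aligned
with the original batch, at the cost of waiting for the slowest worker.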

Thoughts?

Somewhere on GitHub I actually have the changes to the view server for that
batching, but it doesn't include the changes on the Erlang side.

-Mikeal
