Brian's description of the reduce function is clear, but I think you can achieve your goal as below:

    function(doc) {
      emit(doc.group_key, doc);
    }

    function(keys, values, rereduce) {
      // ...
    }

With the group=true option you can process the docs by group_key in the reduce function. In your example, reduce will be invoked two times: once with values [firstdoc,seconddoc] for group_key=1 and once with [thirddoc,fourthdoc] for group_key=2. Is that enough?

On Mon, Jun 22, 2009 at 5:07 PM, Brian Candler <b.cand...@pobox.com> wrote:

> On Fri, Jun 19, 2009 at 09:43:31AM +0200, Daniel Trümper wrote:
> > Hi,
> >
> > I am somewhat new to CouchDB but have been doing some stuff with it and
> > this is my first post to the list so pardon if I am wrong :)
> >
> >
> >> It would be really cool if there were some way to pass all the docs
> >> with a value of 1 for group_key to a single map function call, so I
> >> could do computation across those related documents and emit the
> >> results ... I'm just using the magic group_key attribute as an
> >> example, if such a feature were to actually be made I'd think you'd
> >> define a javascript function which returned a single groupping k to
> >> exist I
> > I think this is what the reduce function is for.
>
> No, I'm afraid it's not.
>
> The OP wants to calculate information across a group of related documents.
> CouchDB does not guarantee that all the related documents will be passed to
> the reduce function at the same time. It may pass documents (d1,d2,d3) to
> the reduce function to generate Rx, then pass (d4,d5,d6) to the reduce
> function to generate Ry, then (d7,d8,d9) to generate Rz, then pass
> (Rx,Ry,Rz) to the re-reduce function to generate the final R value.
>
> If the values sharing the key were e.g. d3,d4 then you won't be able to
> process them together, as they would not be presented to the reduce function
> at the same time.
>
> Using a grouped reduce query is better (i.e. group=true), but a large set of
> documents sharing the same group key are still likely to be split into
> several reductions with a re-reduce.
> The OP was talking about ~100 documents
> sharing this key, and so they may well be split this way.
>
> Furthermore, CouchDB optimises its reductions by storing the reduced value
> for all the documents within the same Btree node. For example, suppose you
> have
>
>   +-------------+  +-------------+  +-------------+
>   | d1 d2 d3 Rx |  | d4 d5 d6 Ry |  | d7 d8 d9 Rz |
>   +-------------+  +-------------+  +-------------+
>
> Then you make a reduce query for the key range which includes documents d2
> to d8 inclusive (or a grouped query where d2 to d8 share the same group
> key). CouchDB will calculate:
>
>   R1 = Reduce(d2,d3)
>   R2 = Reduce(d7,d8)
>   R = Rereduce(R1,Ry,R2)
>
> That is: the already-reduced value of Ry=Reduce(d4,d5,d6) is reused without
> recomputation. So the reduce function doesn't see documents d4 to d6 again.
>
> So in summary: you cannot rely on the reduce function to be able to process
> adjacent documents. You *must* do this sort of processing client-side.
>
> HTH,
>
> Brian.
>

-- 
Yours sincerely

Jack Su
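
A minimal sketch of the two points discussed in this thread, written as plain JavaScript that runs outside CouchDB. The sample rows, the `amount` field, and the helper names `groupByKey` and `sumReduce` are all made up for illustration; they are not part of CouchDB. The first part groups rows by key on the client, as Brian recommends. The second part imitates a reduce that is split into partial reductions followed by a rereduce, which only gives the right answer because the function handles the `rereduce` flag.

```javascript
// Hypothetical rows, shaped like the output of a view that emits
// (doc.group_key, doc) and is queried with reduce=false.
const rows = [
  { key: 1, value: { _id: "d1", amount: 10 } },
  { key: 1, value: { _id: "d2", amount: 5 } },
  { key: 2, value: { _id: "d3", amount: 7 } },
  { key: 2, value: { _id: "d4", amount: 1 } },
];

// Client-side grouping: collect every document sharing a key into one array,
// so related documents can be processed together.
function groupByKey(viewRows) {
  const groups = new Map();
  for (const row of viewRows) {
    if (!groups.has(row.key)) {
      groups.set(row.key, []);
    }
    groups.get(row.key).push(row.value);
  }
  return groups;
}

// A reduce that stays correct however the values are split:
// on rereduce the values are earlier results (numbers), otherwise documents.
function sumReduce(keys, values, rereduce) {
  if (rereduce) {
    return values.reduce((acc, partial) => acc + partial, 0);
  }
  return values.reduce((acc, doc) => acc + doc.amount, 0);
}

// Imitate a split: two partial reductions, then a rereduce over the partials.
// keys is passed as null here for simplicity; this sketch does not use it.
const docs = rows.map((row) => row.value);
const rx = sumReduce(null, docs.slice(0, 3), false);
const ry = sumReduce(null, docs.slice(3), false);
const total = sumReduce(null, [rx, ry], true);

const groups = groupByKey(rows);
console.log(total, groups.get(1).length, groups.get(2).length);
```

Note that the first partial reduction above mixes documents from both keys, and the second sees only d4. A reduce that needed d3 and d4 side by side would not get them, which is why the per-group processing is done in `groupByKey` instead.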