On Mon, Dec 7, 2009 at 2:34 AM, Brian Candler <[email protected]> wrote:
> I am thinking about storing some derived data which is associated with key
> ranges of a view. (Example: an image which provides a graphical summary of a
> key range).
>
> I would like to determine when it's time to regenerate an image, that is,
> when the underlying view has changed within that range.
>
> One thought I had was if I could make a reduce function which was some sort
> of checksum of the key/value pairs. Then I could just do a reduce query
> across the key range, and see if the reduce value has changed. It would be
> like an etag for the range.
>
> Unfortunately, I can't just do something simple like an md5sum across the
> range, because couchdb implements a tree of reduces and re-reduces, and may
> decide to restructure this tree. I'd like a checksum which is invariant
> across all possible reduce trees for the same data.
>
> Something simple would be to XOR all the keys and values together, but
> sometimes this would not detect changes which happen to XOR to the same
> data.
>
> Perhaps I should md5 each (key,value) pair, and then XOR all those together
> in the reduce function.
>
> Since my docs have updated timestamps, maybe I should just take the max() of
> the updated timestamp for each doc, together with a count of the docs (so as
> to be able to detect deletions)

This is a great question. If there's a generic way to do this, and it
is cheap enough, it could be generalized to handle view etags.

Your row count + max timestamp trick seems sensible to me, but it
obviously does not generalize to views without timestamps.

Presumably you could avoid hashing the keys and values by leaning on
each document's _rev. However, that just pushes the problem back a step.

What we need for a general solution is a commutative and associative
checksum function, which would be a funny beast indeed.

Chris

>
> I just wondered if anyone had already made an elegant solution for this? Or
> some completely different way of determining whether a view has changed
> between a given startkey and endkey?
>
> Thanks,
>
> Brian.
>

--
Chris Anderson
http://jchrisa.net
http://couch.io
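
For what it's worth, here is a rough sketch of what a commutative and associative checksum could look like, following Brian's idea of hashing each (key, value) pair. It is written in Python purely for illustration (a real CouchDB reduce function would normally be JavaScript), and the names row_digest, reduce_rows and rereduce are made up for this example. It combines the per-row MD5 digests by addition modulo 2**128 rather than XOR, so that two identical rows do not cancel each other out, and it carries a row count alongside. Using JSON as the canonical serialization of a row is an assumption.

```python
import hashlib
import json

MOD = 1 << 128  # MD5 digests are 128 bits wide


def row_digest(key, value):
    """Hash one (key, value) row to a 128-bit integer."""
    # Assumption: JSON with sorted keys is a good enough canonical form.
    data = json.dumps([key, value], sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def reduce_rows(rows):
    """Leaf-level reduce: (row count, sum of row digests mod 2**128)."""
    total = 0
    for key, value in rows:
        total = (total + row_digest(key, value)) % MOD
    return (len(rows), total)


def rereduce(partials):
    """Combine partial results; addition is commutative and associative."""
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials) % MOD
    return (count, total)


rows = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]

# One flat reduce over the whole range.
flat = reduce_rows(rows)

# Two differently shaped reduce trees over the same rows.
tree1 = rereduce([reduce_rows(rows[:1]), reduce_rows(rows[1:])])
tree2 = rereduce([
    rereduce([reduce_rows(rows[:2])]),
    reduce_rows(rows[2:][::-1]),  # row order inside a node does not matter
])

# The same range after one value has changed.
changed = reduce_rows([("a", 1), ("b", 2), ("c", 3), ("d", 5)])
```

Because integer addition modulo 2**128 is both commutative and associative, flat, tree1 and tree2 come out equal however the rows are grouped, while changed differs unless the two row digests happen to collide. This is not a cryptographically strong construction: someone who controls the rows could probably engineer two different row sets with the same sum, so it only suits detecting accidental changes.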
