On Mon, Mar 01, 2010 at 08:33:53AM -0800, J Chris Anderson wrote:
> 
> On Feb 28, 2010, at 11:01 PM, James Marca wrote:
> 
> > On Mon, Mar 01, 2010 at 10:29:03AM +1300, Blair Nilsson wrote:
> >> It shouldn't be surprising though, the target database may already
> >> have records in it that would change the results, which would be
> >> difficult to detect without running the map on all the data that was
> >> already there. Also it is quite likely that it would take longer to
> >> replicate all the view data than to regenerate it. Hell, you may never
> >> use that view on the replicated end so transferring the processed data
> >> is a waste anyway.
> >> 
> > 
> > Okay, but I still think it is a bug.  Aside from specific document
> > conflicts, the rules for views are that identical input equals
> > identical output.  So the documents that replicate successfully from
> > one db to the other should produce identical output
> > from identical view code.  I don't know much about b-trees, but I
> > suspect there are algorithms to merge two b-trees efficiently.
> > > If that is true and the view is already computed, isn't the
> > > laziest response just to copy it over and merge it with the
> > > current view, even if you have to somehow caveat the replication
> > > conflicts?
> 
> it wouldn't be wrong to do this, but we certainly don't do it yet... 
> complexity. time. we'll get there.


Yes, I apologize: "bug" is the wrong word; "feature request" is what I
meant to say. I wish I could tackle this myself, but my time is no
longer my own these days.

> 
> 
> > 
> > CouchDB seems intelligent enough in the view generation to notice when
> > docs have changed and only compute views on those docs, so why can't
> > similar code get thrown at this?
> > 
> > > As to whether copying the views is useful or not, I think it is
> > application-specific.  I've got a couple terabytes of data waiting in
> > the pipe to get processed this way, so actually, in my use case,
> > re-running the view is out of the question, and re-using views is the
> > height of efficiency.  And finally, I've only got two views (two
> > design documents) and I'm certainly going to be using them!
> > 
> 
> One thing you can do is merge the view queries without merging the 
> databases. As long as you have identical view definitions and you can bridge 
> the nodes with something like CouchDB Lounge smartproxy, you should be good.

I just might try that.  Lounge looks like it's getting lots of
developer attention.  All I really want in the short term is to hide
merging the view queries from the client.  In the longer term though
I'd love to physically stick a couchdb server on data collection boxes
in the field, so that collecting data becomes a simple pull
replication.
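
For what it's worth, here is a rough sketch of what I understand by
"merging the view queries" on the client side. It is not Lounge's
actual code, only my guess at the idea. The function names and the
sample rows are made up for illustration, and I am assuming that all
keys are plain strings so that Python's ordering matches the order
the rows come back in (CouchDB's own collation rules are richer than
that). Map rows are merged by (key, id); reduced rows with the same
key are combined with a rereduce function such as sum.

```python
import heapq
from itertools import groupby


def merge_map_rows(*row_lists):
    """Merge map-view rows from several nodes into one sorted list.

    Each input list must already be sorted by (key, id), which is the
    order in which a view returns its rows.
    """
    return list(heapq.merge(*row_lists, key=lambda r: (r["key"], r["id"])))


def merge_reduce_rows(rereduce, *row_lists):
    """Merge grouped reduce rows, combining values that share a key.

    `rereduce` takes a list of partial values and returns one value,
    playing the role of the view's reduce function in rereduce mode.
    """
    merged = heapq.merge(*row_lists, key=lambda r: r["key"])
    return [
        {"key": key, "value": rereduce([r["value"] for r in group])}
        for key, group in groupby(merged, key=lambda r: r["key"])
    ]


# Hypothetical rows, shaped like the "rows" array of a view response.
node_a = [
    {"id": "a1", "key": "2010-02-27", "value": 3},
    {"id": "a2", "key": "2010-03-01", "value": 5},
]
node_b = [
    {"id": "b1", "key": "2010-02-28", "value": 4},
    {"id": "b2", "key": "2010-03-01", "value": 1},
]

all_rows = merge_map_rows(node_a, node_b)

# Reduced (group=true style) rows carry no document id.
reduced_a = [{"key": r["key"], "value": r["value"]} for r in node_a]
reduced_b = [{"key": r["key"], "value": r["value"]} for r in node_b]
totals = merge_reduce_rows(sum, reduced_a, reduced_b)
```

This only works if both nodes run identical view definitions, as
noted above, and if the reduce function can safely be applied to its
own partial results.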

Regards, 

James Marca
