I think it'd make sense to only include the leaf revs. It's annoying that an update to an old document blows away such a big chunk of the Merkle tree (which makes recovery from a mismatch harder), but the Merkle tree does have the advantage over the rolling hash of only requiring extra space in the inner btree nodes in couch_btree. Not sure how easily one could layer a Merkle tree on top of other replicating systems (this is replication@, after all).
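
To make the leaf-revs-only idea concrete, here is a rough sketch in Python. It is not how couch_btree does reductions (those are Erlang and follow the btree's own node layout); the names `leaf_hash` and `merkle_root`, the flat pairwise tree, and the sample data are all assumptions for illustration. Each by-sequence entry is hashed from its doc id and its leaf revs only, so rev-stemming the ancestors should leave the root unchanged, while a database restored from an older backup gets a different root even at the same sequence number.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(doc_id, branches):
    # Each branch is a list of rev ids, oldest first; only the last one
    # (the leaf) is hashed. Sorting makes conflict-branch order irrelevant.
    leaves = sorted(branch[-1] for branch in branches)
    return h("\0".join([doc_id] + leaves).encode("utf-8"))


def merkle_root(by_seq):
    # by_seq maps sequence number -> (doc_id, branches). Hypothetical shape,
    # standing in for the by-sequence tree.
    level = [leaf_hash(doc_id, branches)
             for _, (doc_id, branches) in sorted(by_seq.items())]
    if not level:
        return h(b"")
    while len(level) > 1:
        # Combine neighbours pairwise; an odd node is hashed on its own.
        level = [h(b"".join(level[i:i + 2])) for i in range(0, len(level), 2)]
    return level[0]


full = {1: ("a", [["1-x", "2-y", "3-z"]]), 2: ("b", [["1-p"]])}
stemmed = {1: ("a", [["3-z"]]), 2: ("b", [["1-p"]])}       # ancestors dropped
restored = {1: ("a", [["1-x", "2-y"]]), 2: ("b", [["1-p"]])}  # older backup

print(merkle_root(full) == merkle_root(stemmed))   # same leaf revs
print(merkle_root(full) == merkle_root(restored))  # same last seq, different content
```

The flat tree here is rebuilt from scratch each time; in a real btree reduction only the inner nodes on the path to a changed entry would be recomputed, which is where the "update to an old document" cost comes from.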

Adam

> On Apr 15, 2014, at 3:57 PM, Chris Anderson <[email protected]> wrote:
>
> I think compaction preserves old rev ids and sequences, but rev-stemming
> could result in a mismatch unless only the leaf revs are hashed in the
> merkle reduction.
>
>> On Tuesday, April 15, 2014, Calvin Metcalf <[email protected]> wrote:
>>
>> won't compaction make that tricky to calculate retroactively?
>>
>>
>> On Tue, Apr 15, 2014 at 3:10 PM, Chris Anderson <[email protected]> wrote:
>>
>>> If you want to know if checkpoints are the same, maybe a combination of the
>>> sequence number and a merkle tree of document revision ids would work? It
>>> would require adding a reduction to the by sequence tree, but you'd be able
>>> to know if two sequences also refer to the same content. Eg is the source
>>> database the same as you talked to last, or just a new one with the same
>>> sequence number.
>>>
>>> Chris
>>>
>>>
>>> On Tue, Apr 15, 2014 at 11:54 AM, Calvin Metcalf <[email protected]> wrote:
>>>
>>>> I think the problem is not as much deleting and recreating a database but
>>>> wiping a virtual machine and restoring from a backup, now you have more or
>>>> less gone back in time with the target database and it has different stuff
>>>> but the same uuid.
>>>>
>>>>
>>>>> On Tue, Apr 15, 2014 at 2:32 PM, Dale Harvey <[email protected]> wrote:
>>>>
>>>>> I dont understand the problem with per db uuids, so the uuid isnt
>>>>> multivalued nor is it queried
>>>>>
>>>>> A is readyonly, B is client, B starts replication from A
>>>>> B reads the db uuid from A / itself, generates a replication_id, stores
>>>>> on B
>>>>> try to fetch replication checkpoint, if successful we query changes from
>>>>> since?
>>>>>
>>>>> In pouch we store the uuid along with the data, so file based backups arent
>>>>> a problem, seems couchdb could / should do that too
>>>>>
>>>>> This also fixes the problem mentioned on the mailing list, and one I have
>>>>> run into personally where people forward db requests but not server
>>>>> requests via a proxy
>>>>>
>>>>>
>>>>> On 15 April 2014 19:18, Calvin Metcalf <[email protected]> wrote:
>>>>>
>>>>>> except there is no way to calculate that from outside the database as
>>>>>> changes only ever gives the more recent document version.
>>>>>>
>>>>>>
>>>>>> On Sun, Apr 13, 2014 at 9:47 PM, Calvin Metcalf <[email protected]> wrote:
>>>>>>
>>>>>>> oo didn't think of that, yeah uuids wouldn't hurt, though the more I think
>>>>>>> about the rolling hashing on revs, the more I like that
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Apr 13, 2014 at 6:00 PM, Adam Kocoloski <[email protected]> wrote:
>>>>>>>
>>>>>>>> Yes, but then sysadmins have to be very very careful about restoring from
>>>>>>>> a file-based backup. We run the risk that {uuid, seq} could be
>>>>>>>> multi-valued, which diminishes its value considerably.
>>>>>>>>
>>>>>>>> I like the UUID in general -- we've added them to our internal shard
>>>>>>>> files at Cloudant -- but on their own they're not a bulletproof solution
>>>>>>>> for read-only incremental replications.
>>>>>>>>
>>>>>>>> Adam
>>>>>>>>
>>>>>>>>> On Apr 13, 2014, at 5:16 PM, Calvin Metcalf <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> I mean if your going to add new features to couch you could just have the
>>>>>>>>> db generate a random uuid on creation that would be different if it was
>>>>>>>>> deleted and recreated
>>
>> --
>> -Calvin W. Metcalf
>
>
> --
> Chris Anderson @jchris
> http://www.couchbase.com
