On Mar 30, 2016 11:03 AM, "Jan Lehnardt" <j...@apache.org> wrote:
>
> Heya Michael,
>
> thanks for taking the time to write this up!
No problem; thanks for reading! :)

> Could the same* be achieved by taking _revs out of the _rev calculation?

It wouldn't make a difference, unfortunately. The problem is different revid algorithms coexisting in the same replication ecosystem. When the underlying calculation isn't the same between the systems, how they differ is mostly moot.

> My question would be: how often does the “same doc, but different _revs
> history”-scenario happen as opposed to other conflicts?

In the case of third-party system replication, I expect it to happen every time they replicate with each other, for any updated doc id that the two systems share. They use different revid algorithms, so when one system loads its revision ids into the other, the ids won't match and the receiving system will record a conflict, as if it were the same document with two histories.

> I’m thinking, since content conflicts (_digest/_signature mismatch) still
> have to be handled outside of CouchDB, and while writing that logic, doing
> a content-equivalence check as a first shortcut in a conflict resolution
> function, isn’t the added overhead (more _fields, more keeping track of
> stuff, more entries in indexes etc) maybe not worth it, if clients have
> an easy way of doing their own autoresolve?

This is one of those cases where it feels like a lot of people would do less work if the server did it for us already. For every document conflict that ever arises, some piece of code on some CPU somewhere should do the "is the content actually different" test, and ideally that would only need to be done once. Given its ubiquity of execution, having the server do duplicate content removal at the time it detects a revision conflict seems reasonable. Conflicts are fairly uncommon, and the server is likely already parsing the document in order to store it, so this only adds the step of comparing the calculated md5s between just the currently active branches.
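As a rough illustration of that check (a sketch only, not CouchDB code: the function names, the canonical-JSON form, and the rule of ignoring underscore-prefixed fields are all assumptions made up for this example):

```python
import hashlib
import json


def content_digest(doc):
    """MD5 over the document body, ignoring underscore-prefixed metadata.

    Assumption: canonical JSON (sorted keys, compact separators) stands in
    for whatever canonical form a real server would hash.
    """
    body = {k: v for k, v in doc.items() if not k.startswith("_")}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def distinct_branches(leaves):
    """Keep one leaf revision per distinct content digest.

    `leaves` is a list of leaf documents (the currently active branches).
    Order is preserved, so the first leaf with a given content wins.
    """
    seen = set()
    kept = []
    for leaf in leaves:
        digest = content_digest(leaf)
        if digest not in seen:
            seen.add(digest)
            kept.append(leaf)
    return kept


# Two systems with different revid algorithms hold the same content under
# different _rev values; the third leaf really does differ.
leaves = [
    {"_id": "doc1", "_rev": "2-abc", "name": "Ada", "tags": ["x"]},
    {"_id": "doc1", "_rev": "2-9f3", "tags": ["x"], "name": "Ada"},
    {"_id": "doc1", "_rev": "2-777", "name": "Ada", "tags": ["y"]},
]
remaining = distinct_branches(leaves)
print([leaf["_rev"] for leaf in remaining])  # ['2-abc', '2-777']
```

Only the two branches whose content actually differs survive, so whatever the client sees as a conflict is a real content conflict.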

The _signatures field isn't required, but it does speed things up if revids don't match (for example, when they come from a different revid algorithm). In fact, Couch could change the revid to its own algorithm to make this even easier if it wants.

This way, 1) if a conflict is present, the client would already have reasonable assurances the content is actually different, and 2) when replicating between systems with different revid algos, they won't create a conflict.

Mike