Re: What happens with a document, if a conflict is not resolved?

Damien Katz Fri, 30 Oct 2009 03:47:00 -0700


On Oct 30, 2009, at 4:33 AM, Brian Candler wrote:

On Thu, Oct 29, 2009 at 01:51:57PM -0400, Damien Katz wrote:
Is this a sensible API? You decide. I've given my opinion previously.
This API seems weird, but it's the closest thing we can have to
multi-document transactions in CouchDB and be a distributed, partitioned
database. This is because it's pretty much impossible to support
all-or-nothing conflict checking transactions with a partitioned database
without some sort of double-lock checking, which is slow and expensive.

I don't want to prevent conflicts, nor do I want transactions. As you say,
introducing conflicting revisions is a fact of life in a distributed-master
system.

However, I believe that CouchDB's API actively discourages people from
writing apps which deal with conflicts properly, by (a) hiding them, and (b)
making resolve-on-read a multi-step process (e.g. readA, readB, readC,
writeA, deleteB, deleteC) which itself is race-prone and may lead to more
conflicts and odd intermediate states (*)

This is true if the conflicts are being resolved on more than one node.
You can't avoid this.

What I would like to see is the following.
1. When you request document X, you get *all* conflicting revisions in one
  go. That is, they are treated as equal peers; none is promoted to winner.
  (However, the list can be sorted in a deterministic order, so you could
  get the current behaviour by just picking the first revision from the
  list)

2. When you perform this request, you get a single "context" tag
  which identifies this particular *set* of revisions.

3. When you write back the new document, you supply the context tag, and
  this simultaneously supersedes all the other documents. Effectively this
  would be like the _rev you use today, but it would refer to the set.
  It could actually just be an array of _revs, but the user should treat
  it as an opaque tag.

4. Views get to see the whole set of revisions too. Again, if they want
  today's behaviour they can just use docs[0] and ignore the others; but
  if they want to resolve conflicts they can too.

5. If two clients replace a document or set of conflicts with a new
  document, and the new documents are identical, then they are not
  treated as conflicts.

When reading papers on systems like Dynamo, they all seem to have properties
(1)-(3). That is: it's treated as natural that conflicts should arise; that
these are fully exposed to the client; and the client is given the
opportunity to resolve them in a single step.
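
For illustration only, here is a toy in-memory sketch of how points (1)-(3)
and (5) of the proposal might behave. Nothing here is CouchDB's actual API:
the class ConflictAwareStore, its methods get/put/inject, and the choice of a
content hash as the revision id are all assumptions made for the example.

```python
import hashlib
import json


def rev_id(body):
    """Hypothetical revision id: a hash of the content, so identical
    bodies written by different clients get the same id (point 5)."""
    canonical = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha1(canonical).hexdigest()


class ConflictAwareStore:
    """Toy store: each doc id maps to {rev_id: body} for all live revisions."""

    def __init__(self):
        self._docs = {}

    def inject(self, doc_id, body):
        """Simulate replication adding a revision next to existing ones."""
        self._docs.setdefault(doc_id, {})[rev_id(body)] = body

    def get(self, doc_id):
        """Return all revisions in a deterministic order, plus a context
        tag identifying exactly this set of revisions (points 1 and 2)."""
        revs = self._docs.get(doc_id, {})
        ordered = sorted(revs)
        context = tuple(ordered)  # callers should treat this as opaque
        return [revs[r] for r in ordered], context

    def put(self, doc_id, body, context):
        """Replace every revision named in the context with one new
        revision (point 3). Revisions that arrived after the read are
        not in the context, so they stay behind as conflicts."""
        revs = self._docs.setdefault(doc_id, {})
        for r in context:
            revs.pop(r, None)
        revs[rev_id(body)] = body
```

With this shape, a client reads once, merges the returned list however it
likes, and writes once with the context it was given. Two clients that
resolve the same set to the same body end up with a single revision.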

That's a matter of opinion. It sounds more difficult from a client
perspective to me to have to deal with conflicts on every read operation.

If you want an easier API for saving documents into a conflicted state
(something like ?conflict=ok), that would be a fairly easy patch to
make. But I'm not sure why users would want that for a single document.

I think that ultimately the 409 behaviour could be dropped if conflicts were
handled as above, but that's not my number one concern.

My concern is this:

* Someone writes an application

* They use the "obvious" API: i.e. simple GET and PUT for reading and
 updating documents. They code to the 409 for avoiding conflicts. It all
 works fine and they are delighted with couchdb.

* They switch to multi-master

* All hell breaks loose. Users see their docs vanishing. Application writer
 finally works out how to do conflict management properly, and has to
 rewrite the app entirely so that (for example) one GET becomes a
 GET with ?conflicts=true, followed by multiple GETs for the additional
 versions, followed by conflict resolution followed by a POST
 to _bulk_docs to replace the original document and conflicts.

* Application writer curses couchdb, and curses the person who wrote
 "Most applications require no special planning to take advantage of
 distributed updates and replication".
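
As a rough sketch of the last step in that sequence, the function below builds
the JSON body for the POST to _bulk_docs: one updated document plus a deletion
stub for each losing revision. The function name and its parameters are
hypothetical, no HTTP call is made, and the optional "all_or_nothing" flag is
included only because the footnote to this message mentions it; whether a
given CouchDB version honours it is not something this sketch checks.

```python
def build_resolution_payload(winner, conflict_revs, merged_fields,
                             all_or_nothing=False):
    """Build a _bulk_docs request body that resolves a conflicted document.

    winner        -- the doc returned by GET ...?conflicts=true
                     (must carry "_id" and "_rev")
    conflict_revs -- the losing revision ids, e.g. winner["_conflicts"]
    merged_fields -- the application's merged content (hypothetical input;
                     how to merge is up to the application)
    """
    doc_id = winner["_id"]

    # The merged content is written on top of the current winning revision.
    resolved = dict(merged_fields)
    resolved["_id"] = doc_id
    resolved["_rev"] = winner["_rev"]

    docs = [resolved]
    # Each losing revision is removed by writing a deletion stub against it.
    for rev in conflict_revs:
        docs.append({"_id": doc_id, "_rev": rev, "_deleted": True})

    payload = {"docs": docs}
    if all_or_nothing:
        payload["all_or_nothing"] = True
    return payload
```

This covers only the write. The reads that come before it (one GET with
?conflicts=true, then one GET per conflicting revision) are still separate
requests, which is where the races described in this message come from.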


It sounds like the dev didn't read the documentation.

Yes, I know patches are welcome. The reason I'm not contributing code for
this right now is that I have higher priorities - I'm happy to keep my app
409-tied while I work on other things. But at the back of my mind, I know
that I won't be going multi-master for a long time, if ever.

Patches are welcome, and most everything you propose could be done in a
front end that's not that involved.

Regards,

Brian.

(*) Yes, I know that *with care* you can do the writes and deletes together
as a single _bulk_docs operation, and even bind them together using
"all_or_nothing":true. But this is not obvious. And there are still races.
For example, I'm not sure that you can use a multi-key fetch for getting all
the conflicting revisions in one hit, so you have a series of GETs, and you
may find that the revs you're GETting have vanished by the time you read
them.
