On Oct 30, 2009, at 4:33 AM, Brian Candler wrote:

On Thu, Oct 29, 2009 at 01:51:57PM -0400, Damien Katz wrote:
Is this a sensible API? You decide. I've given my opinion previously.


This API seems weird, but it's the closest thing we can have to
multi-document transactions in CouchDB while remaining a distributed,
partitioned database. This is because it's pretty much impossible to
support all-or-nothing, conflict-checking transactions with a partitioned
database without some sort of double-lock checking, which is slow and
expensive.

I don't want to prevent conflicts, nor do I want transactions. As you say,
introducing conflicting revisions is a fact of life in a distributed
multi-master system.

However, I believe that CouchDB's API actively discourages people from
writing apps which deal with conflicts properly, by (a) hiding them, and (b)
making resolve-on-read a multi-step process (e.g. readA, readB, readC,
writeA, deleteB, deleteC) which itself is race-prone and may lead to more
conflicts and odd intermediate states (*).

This is true if the conflicts are being resolved on more than one node. You can't avoid this.


What I would like to see is the following.

1. When you request document X, you get *all* conflicting revisions in one go. That is, they are treated as equal peers; none is promoted to winner.

(However, the list can be sorted in a deterministic order, so you could get
the current behaviour by just picking the first revision from the list.)

2. When you perform this request, you get a single "context" tag
  which identifies this particular *set* of revisions.

3. When you write back the new document, you supply the context tag, and this simultaneously supersedes all the other documents. Effectively this
  would be like the _rev you use today, but it would refer to the set.
It could actually just be an array of _revs, but the user should treat
  it as an opaque tag.

4. Views get to see the whole set of revisions too. Again, if they want today's behaviour they can just use docs[0] and ignore the others; but
  if they want to resolve conflicts they can too.

5. If two clients replace a document or set of conflicts with a new
  document, and the new documents are identical, then they are not
  treated as conflicts.
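
Properties 1, 2, 3 and 5 above can be sketched as a small in-memory model.
This is only an illustration of the proposal, not CouchDB's behaviour: the
ConflictStore class, the rev_of helper and the content-hash revision ids are
all hypothetical, and property 4 (views) is left out.

```python
import hashlib
import json


def rev_of(body):
    """Hypothetical content-derived revision id: same body, same rev."""
    canonical = json.dumps(body, sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:8]


class ConflictStore:
    """Toy model of the proposed API; not how CouchDB actually works."""

    def __init__(self):
        self._docs = {}  # doc_id -> {rev: body}

    def read(self, doc_id):
        """Properties 1 and 2: every revision, sorted, plus one context tag."""
        revs = self._docs.get(doc_id, {})
        ordered = sorted(revs)          # deterministic order
        context = json.dumps(ordered)   # callers treat this as opaque
        return [revs[r] for r in ordered], context

    def write(self, doc_id, body, context):
        """Property 3: the new body supersedes every revision in the context."""
        revs = self._docs.setdefault(doc_id, {})
        for old in json.loads(context):
            revs.pop(old, None)         # a revision that is already gone is fine
        revs[rev_of(body)] = body       # property 5: identical bodies collapse
```

In this model, two clients that write different bodies against the same
context leave two revisions behind, and the next reader gets both along with
a new context tag that lets it replace them in one step.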

The papers I've read on systems like Dynamo all seem to describe properties (1)-(3). That is: it's treated as natural that conflicts should arise; that
these are fully exposed to the client; and the client is given the
opportunity to resolve them in a single step.

That's a matter of opinion. It sounds more difficult from a client perspective to me to have to deal with conflicts on every read operation.


If you want an easier API for saving documents into a conflicted state
(something like ?conflict=ok), that would be a fairly easy patch to
make. But I'm not sure why users would want that for a single document.

I think that ultimately the 409 behaviour could be dropped if conflicts were
handled as above, but that's not my number one concern.

My concern is this:

* Someone writes an application

* They use the "obvious" API: i.e. simple GET and PUT for reading and
updating documents. They code to the 409 for avoiding conflicts. It all
 works fine and they are delighted with couchdb.

* They switch to multi-master

* All hell breaks loose. Users see their docs vanishing. Application writer
 finally works out how to do conflict management properly, and has to
 rewrite the app entirely so that (for example) one GET becomes a
GET with ?conflicts=true, followed by multiple GETs for the additional
 versions, followed by conflict resolution followed by a POST
 to _bulk_docs to replace the original document and conflicts.

* Application writer curses couchdb, and curses the person who wrote
 "Most applications require no special planning to take advantage of
 distributed updates and replication".
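
The read half of the rewrite described above (one GET with ?conflicts=true,
then one GET per additional version) can be sketched as the list of URLs a
client would have to request. The function names, base URL and database
name are assumptions for illustration; the sketch only builds URLs and
performs no HTTP requests.

```python
from urllib.parse import quote, urlencode


def doc_url(base_url, db, doc_id):
    """URL of a document, with the database and document id escaped."""
    return "/".join([base_url.rstrip("/"), quote(db, safe=""), quote(doc_id, safe="")])


def first_read_url(base_url, db, doc_id):
    """First GET: ask for the document together with its _conflicts list."""
    return doc_url(base_url, db, doc_id) + "?conflicts=true"


def conflict_read_urls(base_url, db, doc):
    """Follow-up GETs: one per revision listed in the doc's _conflicts."""
    url = doc_url(base_url, db, doc["_id"])
    return [url + "?" + urlencode({"rev": rev}) for rev in doc.get("_conflicts", [])]
```

A document with two entries in _conflicts therefore costs three round trips
before any resolution logic can run, and nothing stops the revisions from
changing between those requests.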

It sounds like the dev didn't read the documentation.


Yes, I know patches are welcome. The reason I'm not contributing code for this right now is that I have higher priorities - I'm happy to keep my app 409-tied while I work on other things. But at the back of my mind, I know
that I won't be going multi-master for a long time, if ever.

Patches are welcome, and most everything you propose could be done in a front end that isn't that involved.


Regards,

Brian.

(*) Yes, I know that *with care* you can do the writes and deletes together
as a single _bulk_docs operation, and even bind them together using
"all_or_nothing":true. But this is not obvious. And there are still races. For example, I'm not sure that you can use a multi-key fetch for getting all the conflicting revisions in one hit, so you have a series of GETs, and you may find that the revs you're GETting have vanished by the time you read
them.
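
As a rough sketch of the single _bulk_docs write mentioned in this footnote,
the request body could be built like this. The function name and the sample
documents are hypothetical; the code only builds the JSON body to POST to
_bulk_docs and does not address the read-side races described above.

```python
import json


def resolution_payload(merged_body, winner, losers):
    """Build one _bulk_docs body: update the winner, delete the losers."""
    # The merged content is written on top of the winning revision.
    docs = [dict(merged_body, _id=winner["_id"], _rev=winner["_rev"])]
    # Each losing revision is replaced by a deletion stub.
    docs += [{"_id": d["_id"], "_rev": d["_rev"], "_deleted": True} for d in losers]
    return json.dumps({"all_or_nothing": True, "docs": docs})
```

The body carries one entry per revision being superseded, so the client
still has to know every conflicting _rev before it can write.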
