On Oct 30, 2009, at 4:33 AM, Brian Candler wrote:
On Thu, Oct 29, 2009 at 01:51:57PM -0400, Damien Katz wrote:
Is this a sensible API? You decide. I've given my opinion
previously.
This API seems weird, but it's the closest thing we can have to
multi-document transactions in CouchDB while still being a distributed,
partitioned database. This is because it's pretty much impossible to
support all-or-nothing, conflict-checking transactions in a partitioned
database without some sort of double-lock checking, which is slow and
expensive.
I don't want to prevent conflicts, nor do I want transactions. As you
say, introducing conflicting revisions is a fact of life in a
distributed-master system.
However, I believe that CouchDB's API actively discourages people from
writing apps which deal with conflicts properly, by (a) hiding them, and
(b) making resolve-on-read a multi-step process (e.g. readA, readB,
readC, writeA, deleteB, deleteC) which itself is race-prone and may lead
to more conflicts and odd intermediate states (*).
This is true if the conflicts are being resolved on more than one
node. You can't avoid this.
What I would like to see is the following.
1. When you request document X, you get *all* conflicting revisions in
   one go. That is, they are treated as equal peers; none is promoted to
   winner. (However, the list can be sorted in a deterministic order, so
   you could get the current behaviour by just picking the first
   revision from the list.)
2. When you perform this request, you get a single "context" tag which
   identifies this particular *set* of revisions.
3. When you write back the new document, you supply the context tag, and
   this simultaneously supersedes all the other documents. Effectively
   this would be like the _rev you use today, but it would refer to the
   set. It could actually just be an array of _revs, but the user should
   treat it as an opaque tag.
4. Views get to see the whole set of revisions too. Again, if they want
   today's behaviour they can just use docs[0] and ignore the others;
   but if they want to resolve conflicts they can too.
5. If two clients replace a document or set of conflicts with a new
   document, and the new documents are identical, then they are not
   treated as conflicts.
When reading papers on systems like Dynamo, they all seem to have
properties (1)-(3). That is: it's treated as natural that conflicts
should arise; that these are fully exposed to the client; and the client
is given the opportunity to resolve them in a single step.
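
To make points 1-3 and 5 concrete, here is a toy in-memory model in
Python. It is only a sketch of the proposed semantics, not of anything
CouchDB does today: the class name ConflictStore, its read/write
methods, and the use of a SHA-1 content hash as the revision id are all
assumptions made for illustration.

```python
import hashlib
import json


class ConflictStore:
    """Toy model of the proposed API; not CouchDB's actual behaviour."""

    def __init__(self):
        # doc_id -> {rev_id: body}; every entry is a live leaf revision.
        self._leaves = {}

    @staticmethod
    def _rev_id(body):
        # Content-derived revision id, so identical bodies collapse (point 5).
        canonical = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha1(canonical).hexdigest()

    def read(self, doc_id):
        """Return (context, revisions): all leaves, sorted (points 1 and 2)."""
        leaves = self._leaves.get(doc_id, {})
        revs = sorted(leaves)
        context = tuple(revs)  # callers should treat this as an opaque tag
        return context, [leaves[rev] for rev in revs]

    def write(self, doc_id, body, context=()):
        """Store body, superseding every revision named in context (point 3)."""
        leaves = self._leaves.setdefault(doc_id, {})
        for rev in context:
            leaves.pop(rev, None)  # already superseded by someone else: ignore
        leaves[self._rev_id(body)] = body


if __name__ == "__main__":
    store = ConflictStore()
    store.write("x", {"n": 1})
    ctx, _ = store.read("x")
    store.write("x", {"n": 2}, ctx)  # client A
    store.write("x", {"n": 3}, ctx)  # client B, same starting context
    ctx, docs = store.read("x")
    print(len(docs))                 # both revisions are returned as peers
    store.write("x", {"n": 5}, ctx)  # one write resolves the whole set
```

Two writers who start from the same context leave two leaf revisions
behind; a later write that passes the combined context replaces both in
a single step.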
That's a matter of opinion. It sounds more difficult from a client
perspective to me to have to deal with conflicts on every read
operation.
If you want an easier API for saving documents into a conflicted state
(something like ?conflict=ok), that would be a fairly easy patch to
make. But I'm not sure why users would want that for a single document.
I think that ultimately the 409 behaviour could be dropped if conflicts
were handled as above, but that's not my number one concern.
My concern is this:
* Someone writes an application
* They use the "obvious" API: i.e. simple GET and PUT for reading and
  updating documents. They code to the 409 for avoiding conflicts. It
  all works fine and they are delighted with CouchDB.
* They switch to multi-master
* All hell breaks loose. Users see their docs vanishing. Application
  writer finally works out how to do conflict management properly, and
  has to rewrite the app entirely so that (for example) one GET becomes
  a GET with ?conflicts=true, followed by multiple GETs for the
  additional versions, followed by conflict resolution, followed by a
  POST to _bulk_docs to replace the original document and conflicts.
* Application writer curses CouchDB, and curses the person who wrote
  "Most applications require no special planning to take advantage of
  distributed updates and replication".
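
The rewritten read path described in the list above might look roughly
like this in Python. The HTTP layer is passed in as two caller-supplied
functions (http_get and http_post are hypothetical names, as is merge),
so the sketch only shows the sequence of requests: ?conflicts=true, one
?rev= GET per conflict, then _bulk_docs with _deleted stubs. It does not
handle the race where a conflicting revision disappears between the
first GET and the later ones.

```python
from urllib.parse import quote


def resolve_conflicts(http_get, http_post, db_url, doc_id, merge):
    """Resolve-on-read for one document.

    http_get(url) -> dict and http_post(url, payload) are supplied by the
    caller (hypothetical wrappers around whatever HTTP client is in use).
    merge(list_of_docs) -> dict is the application's own resolution rule.
    """
    doc_url = f"{db_url}/{quote(doc_id, safe='')}"

    # 1. One GET for the winning revision plus the list of conflicting revs.
    winner = http_get(f"{doc_url}?conflicts=true")
    conflict_revs = winner.pop("_conflicts", [])
    if not conflict_revs:
        return winner  # nothing to resolve

    # 2. One extra GET per conflicting revision.
    losers = [
        http_get(f"{doc_url}?rev={quote(rev, safe='')}")
        for rev in conflict_revs
    ]

    # 3. Application-specific merge, written back as an update of the winner.
    merged = dict(merge([winner] + losers))
    merged["_id"] = winner["_id"]
    merged["_rev"] = winner["_rev"]

    # 4. One POST that updates the winner and deletes the losing revisions.
    tombstones = [
        {"_id": winner["_id"], "_rev": rev, "_deleted": True}
        for rev in conflict_revs
    ]
    http_post(f"{db_url}/_bulk_docs", {"docs": [merged] + tombstones})
    return merged
```

Even in this compact form, what used to be a single GET is now 2 + N
requests for N conflicting revisions, which is the cost being pointed
out here.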
It sounds like the dev didn't read the documentation.
Yes, I know patches are welcome. The reason I'm not contributing code
for this right now is that I have higher priorities - I'm happy to keep
my app 409-tied while I work on other things. But at the back of my
mind, I know that I won't be going multi-master for a long time, if
ever.
Patches are welcome, and most everything you propose could be done in a
front end that's not that involved.
Regards,
Brian.
(*) Yes, I know that *with care* you can do the writes and deletes
together as a single _bulk_docs operation, and even bind them together
using "all_or_nothing":true. But this is not obvious. And there are
still races.
For example, I'm not sure that you can use a multi-key fetch for getting
all the conflicting revisions in one hit, so you have a series of GETs,
and you may find that the revs you're GETting have vanished by the time
you read them.
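
For reference, the body of the single _bulk_docs request mentioned in
this footnote could be assembled as follows. The function name and its
parameters are made up for illustration; only the "docs", "_deleted" and
"all_or_nothing" fields come from the API discussed in this thread, and
whether a given CouchDB version honours "all_or_nothing" should be
checked against its documentation.

```python
import json


def bulk_resolution_payload(doc_id, merged_body, winner_rev, losing_revs,
                            all_or_nothing=True):
    """Build one _bulk_docs body: update the winner, delete the losers."""
    # The merged document is written as a new revision on top of the winner.
    docs = [dict(merged_body, _id=doc_id, _rev=winner_rev)]
    # Each losing revision gets a deletion stub in the same request.
    docs += [
        {"_id": doc_id, "_rev": rev, "_deleted": True}
        for rev in losing_revs
    ]
    payload = {"docs": docs}
    if all_or_nothing:
        payload["all_or_nothing"] = True
    return payload


if __name__ == "__main__":
    body = bulk_resolution_payload("x", {"n": 5}, "2-b", ["2-a", "2-c"])
    print(json.dumps(body, indent=2))
```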