Re: Transactional _bulk_docs

Antony Blakey Thu, 05 Feb 2009 19:03:27 -0800


On 06/02/2009, at 6:20 AM, Chris Anderson wrote:

Antony, maybe it would help for you to explain just exactly what you
wouldn't be able to do, without the bulk docs API. It will help to
inform people about the technical issue.



My original email included this:

-------------------------------------------------------

For example, I have documents that can be cloned. The cloned document contains a reference to the originating document. When I delete the original document, the clone history needs to be updated to remove the reference to the original document and replace it with an original-deleted history item. There is a business case that requires this consistency.

With a transactional API this is easy. Without it, I can't see a way to maintain consistency in the face of concurrent application access and/or failure.


-------------------------------------------------------

However, I don't think this is really about a specific example.

The problem is that if you get one side of the relationship written and visible, but the other side not, then other concurrent accessors will see a partially successful update.
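
To make that failure mode concrete, here is a minimal sketch in Python. It uses an in-memory dict as a stand-in for the database and is not CouchDB's API: TinyStore, write_each, write_all_or_nothing and the cloned_from field are all hypothetical names chosen for illustration. It only shows the difference between two writes applied one at a time and the same two writes applied as a unit, assuming a failure occurs between them.

```python
import copy


class TinyStore:
    """In-memory stand-in for a document store (hypothetical, not CouchDB)."""

    def __init__(self):
        self.docs = {}

    def _apply(self, doc_id, doc):
        # A value of None means "delete this document".
        if doc is None:
            self.docs.pop(doc_id, None)
        else:
            self.docs[doc_id] = doc

    def write_each(self, updates, fail_after=None):
        """Apply updates one at a time; a failure leaves earlier writes visible."""
        for i, (doc_id, doc) in enumerate(updates):
            if fail_after is not None and i >= fail_after:
                raise RuntimeError("simulated failure between writes")
            self._apply(doc_id, doc)

    def write_all_or_nothing(self, updates, fail_after=None):
        """Apply updates to a private copy and publish it only if all succeed."""
        staged = TinyStore()
        staged.docs = copy.deepcopy(self.docs)
        staged.write_each(updates, fail_after)
        self.docs = staged.docs


def is_consistent(store):
    """Every clone's cloned_from reference must point at an existing document."""
    return all(
        doc.get("cloned_from") is None or doc["cloned_from"] in store.docs
        for doc in store.docs.values()
    )


def make_store():
    store = TinyStore()
    store.docs = {
        "orig": {"title": "A"},
        "clone": {"title": "A", "cloned_from": "orig"},
    }
    return store


# Delete the original and rewrite the clone's history in the same request.
UPDATES = [
    ("orig", None),
    ("clone", {"title": "A", "cloned_from": None,
               "history": ["original-deleted"]}),
]


def run(write_method_name):
    """Run the update with a simulated failure after the first write."""
    store = make_store()
    try:
        getattr(store, write_method_name)(UPDATES, fail_after=1)
    except RuntimeError:
        pass
    return is_consistent(store)


partial_ok = run("write_each")            # clone still points at a deleted doc
atomic_ok = run("write_all_or_nothing")   # neither write became visible
```

In this sketch the one-at-a-time path leaves the clone referring to a document that no longer exists, which is the partially successful update a concurrent reader would observe, while the all-or-nothing path leaves the store as it was.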

One response is "but you'll see this problem during replication", but I think this is making a big assumption about how replication is managed/interleaved with local application behaviour.

Replication, and dealing with conflicts, is in no way automatic. As others have stated, there is no domain-independent way of resolving conflicts. Surely if it were possible to build a transactional API on top of a conflict-based system, then this statement would not be true?

I am deploying CouchDB like a Notes CLIENT. Not as a high-performance database server. Replication is an explicit operation, that halts normal activity. For my first delivery, replicas are read-only, so replication conflict isn't possible, but when I move to a distributed writers scenario, resolving replication conflicts will involve a specialized UI, that allows all conflicts to be resolved before normal operation resumes. Thus the editing application always sees a conflict-free database.

The use-case of someone doing a local operation e.g. submitting a web form, is very different than resolving replication conflicts. Conflict during a local operation is a matter of application concurrency, whereas conflict during replication is driven by the overall system model. It has different temporal, administrative and UI boundaries.

In short, I think it is a mistake to try and hide the different characteristics of local (even clustered) operations, and replication. You may disagree, but if the system distinguishes between these two fundamentally different things (distinguished by their partition-tolerance), you can code as though every operation leads to conflict if you wish, but I can't take advantage of the difference.

I know that the long-standing vision of Couch doesn't include special
API exceptions for when you are running on a single node. And I'm a
little afraid that the transactional doc commits Antony wants us to
keep, are only a mirage, which would lead to trouble anyway, when
distributed systems are involved.

I don't understand why this needs to be the case. You can do transactions in distributed systems. Do you have a model that isn't amenable to a Scalaris treatment? Especially given that we're only talking about transactions over a set of processes that are providing an illusion of a single system. Such a cluster already requires some degree of partition-tolerance, right? And if not, then what distinguishes a cluster from a partition-tolerant p2p mesh?


Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd.

  -- Bertrand Russell
