I vote against this patch for the following reasons:

1. My reading of the Bayou research has shown me that transactions can work with replication. The nature of a transaction is an interesting issue, but orthogonal to this argument.

2. It's not clear what the real performance hit is in a partitioned database. Furthermore, the hit may be highly dependent on the configuration and the details of the write, e.g. application design may result in transactions always going to one shard.

3. In any case, I argue that it should be up to the user whether to trade off a possible performance hit for ACID semantics.

4. I'm not convinced of the utility of the two models proposed as replacements for the bulk operations. IMO it would be better to not have a bulk operation than to have the proposed models.

5. The justification is dependent on the implementation details of a future feature that isn't itself described or known. From a procedural point of view it's therefore not possible to assess this argument, because the community has no way of assessing its validity.

6. This argument is also dependent on another argument that CouchDB must provide a single API over both single-node and multi-node operation, and must not allow the user to take advantage of the differences. I disagree with that, but in any case it's not an argument that has been put and resolved by the community.

On 08/02/2009, at 2:17 AM, Damien Katz wrote:

I'm working on a branch that implements the CouchDB security features with replication. It's not done yet, but anyone is welcome to look at the branch in /branches/rep_security.

In this patch I am attempting to implement new transaction models. The old transaction model gives you all-or-nothing commits for a group of docs, along with conflict checking. If any document is in conflict, the transaction as a whole doesn't save.
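A minimal sketch of that old model, as a toy in-memory store rather than CouchDB's actual code. The function name, the `ConflictError` class and the `(rev, body)` layout are assumptions made for illustration only.

```python
class ConflictError(Exception):
    """Raised when a document's expected revision does not match the stored one."""


def bulk_save_all_or_nothing(db, docs):
    """Toy model: db maps doc id -> (rev, body); docs is a list of (id, expected_rev, body)."""
    # Phase 1: check every document for conflicts before writing anything.
    for doc_id, expected_rev, _ in docs:
        current_rev = db[doc_id][0] if doc_id in db else 0
        if current_rev != expected_rev:
            raise ConflictError(doc_id)
    # Phase 2: no conflicts were found, so apply every write.
    for doc_id, expected_rev, body in docs:
        db[doc_id] = (expected_rev + 1, body)


db = {"a": (1, "old a")}
bulk_save_all_or_nothing(db, [("a", 1, "new a"), ("b", 0, "new b")])
```

A single stale revision anywhere in the group raises before any write happens, so the store is left untouched.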

The problems with this are:
1. Transactions don't work with replication. Replication doesn't repeat the single bulk transaction, it just copies the documents individually to the target replica. This means any downstream replica can and will see inconsistent states until replication fully completes, not "all or nothing" states. With bidirectional replication it's even worse, as you can get edit conflicts that must be resolved by an external process.
2. Transactions don't work in a partitioned database without a huge performance hit (locking + 2 phase commits).
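To illustrate the first problem, here is a rough sketch of a replica passing through an inconsistent state while documents are copied one at a time. The `replicate` function and the two account documents are hypothetical, invented for this example; real replication is considerably more involved.

```python
def replicate(source, target):
    """Copy documents one at a time, yielding the target's state after each copy."""
    for doc_id in sorted(source):
        target[doc_id] = source[doc_id]
        yield dict(target)


# Source: a transfer of 50 from account_a to account_b, saved as one bulk transaction.
source = {"account_a": 50, "account_b": 150}
# Target: the replica still holds the pre-transfer state.
target = {"account_a": 100, "account_b": 100}

states = list(replicate(source, target))
# states[0] has the debit but not the credit: a state the source never exposed.
```

The total across both accounts is 200 on the source at all times, but the replica briefly shows 150.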

So I propose supporting 2 different transaction models:

The first is to support "all or nothing commits", but without guaranteed conflict checking. So you can save a bunch of documents to the database and be sure they are all safely stored, or none are safely stored, but you can't be guaranteed that you don't have any conflicts when you do.

The second is to support non-ACID bulk transactions, where some documents fail and some succeed. If the db crashes in the middle of the transaction, some documents may have made it to disk (completely intact), while others have not. The client will need to check to be sure.
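The two proposed models could be sketched roughly as follows. This is a toy in-memory illustration under assumed data layouts, not the patch itself: the function names are hypothetical, storing a stale write as an extra conflicting leaf is an assumption about how "no guaranteed conflict checking" would look, and crash behaviour is not modelled.

```python
def save_all_unchecked(db, docs):
    """Model 1 (toy): all-or-nothing, without conflict checking.

    db maps doc id -> list of (rev, body) leaves; more than one leaf means a conflict.
    """
    staged = {doc_id: list(leaves) for doc_id, leaves in db.items()}
    for doc_id, expected_rev, body in docs:
        leaves = staged.get(doc_id, [])
        current_rev = max((rev for rev, _ in leaves), default=0)
        new_leaf = (expected_rev + 1, body)
        if current_rev == expected_rev:
            staged[doc_id] = [new_leaf]           # clean update
        else:
            staged[doc_id] = leaves + [new_leaf]  # stored anyway, as a conflict
    db.clear()
    db.update(staged)  # one swap stands in for an atomic commit


def save_non_atomic(db, docs):
    """Model 2 (toy): each doc succeeds or fails on its own.

    db maps doc id -> (rev, body); returns a per-document result for the client to check.
    """
    results = {}
    for doc_id, expected_rev, body in docs:
        current_rev = db[doc_id][0] if doc_id in db else 0
        if current_rev == expected_rev:
            db[doc_id] = (expected_rev + 1, body)
            results[doc_id] = "ok"
        else:
            results[doc_id] = "conflict"
    return results


db1 = {"a": [(2, "a2")]}
save_all_unchecked(db1, [("a", 1, "stale"), ("b", 0, "b1")])

db2 = {"a": (2, "a2")}
results = save_non_atomic(db2, [("a", 1, "stale"), ("b", 0, "b1")])
```

In the first model both documents are stored and "a" ends up with two leaves; in the second, "b" is saved while "a" is rejected and reported back to the client.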

With these 2 transaction models, it's possible to deploy the same apps on a single machine or a huge partitioned cluster. With the current model, it's only possible to deploy apps on a single machine. I propose we drop the current model, as bulk transactions are not supportable in clustered or replicated setups.

-Damien


Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There is nothing more difficult to plan, more doubtful of success, nor more dangerous to manage than the creation of a new order of things... Whenever his enemies have the ability to attack the innovator, they do so with the passion of partisans, while the others defend him sluggishly, so that the innovator and his party alike are vulnerable.
  -- Niccolo Machiavelli, 1513, The Prince.

