I vote against this patch for the following reasons:
1. My reading of the Bayou research has shown me that transactions can
work with replication. The nature of a transaction is an interesting
issue, but orthogonal to this argument.
2. The real performance hit in a partitioned database isn't clear.
Furthermore, the hit may be highly dependent on the configuration and
the details of the write, e.g. application design may result in
transactions always going to one shard.
3. In any case, I argue that it should be up to the user whether to
trade off a possible performance hit for ACID semantics.
4. I'm not convinced of the utility of the two models proposed as
replacements for the bulk operations. IMO it would be better to not
have a bulk operation than to have the proposed models.
5. The justification is dependent on the implementation details of a
future feature that isn't itself described or known. From a procedural
point of view, therefore, it's not possible to assess this argument
because the community has no way of assessing its validity.
6. This argument is also dependent on another argument that CouchDB
must provide a single API over both single-node and multi-node
operation, and must not allow the user to take advantage of the
differences. I disagree with that, but in any case it's not an
argument that has been put and resolved by the community.
On 08/02/2009, at 2:17 AM, Damien Katz wrote:
I'm working on a branch that implements the couchdb security
features with replication. It's not done yet, but anyone is welcome to
look at the branch in /branches/rep_security.
In this patch I am attempting to implement new transaction models.
The old transaction model gives you all-or-nothing commits for a group
of docs, along with conflict checking. If any document is in
conflict, the transaction as a whole isn't saved.
The problems with this are:
1. Transactions don't work with replication. Replication doesn't
repeat the single bulk transaction, it just copies the documents
individually to the target replica. This means any downstream
replica can and will see inconsistent states until replication
fully completes, not "all or nothing" states. With bidirectional
replication it is even worse, as you can get edit conflicts that must
be resolved by an external process.
2. Transactions don't work in a partitioned database without a huge
performance hit (locking + 2 phase commits).
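To make the first problem concrete, here is a toy sketch of a
replicator that copies documents one at a time. It is not CouchDB
code: plain dicts stand in for databases, and bulk_save and
replicate_stepwise are made-up names used only for illustration.

```python
# Toy model only: plain dicts stand in for databases.
# bulk_save and replicate_stepwise are hypothetical names, not CouchDB APIs.

def bulk_save(db, docs):
    """Apply a group of docs to the local database in one step."""
    db.update(docs)


def replicate_stepwise(source, target):
    """Copy documents one at a time, yielding a snapshot of the target
    after each individual copy."""
    for doc_id, doc in source.items():
        target[doc_id] = doc
        yield dict(target)


source, target = {}, {}

# Two documents that are only meaningful as a pair (a debit and a credit).
bulk_save(source, {"acct_a": {"balance": 50}, "acct_b": {"balance": 150}})

snapshots = list(replicate_stepwise(source, target))

# After the first copy the target holds only half of the transaction.
print(sorted(snapshots[0]))   # ['acct_a']
# Only after the last copy does the target match the source.
print(sorted(snapshots[-1]))  # ['acct_a', 'acct_b']
```

A reader of the target between the two copies sees acct_a updated
but acct_b missing, which is the inconsistent intermediate state
described above.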
So I propose supporting 2 different transaction models:
The first is to support "All or nothing commits", but without
guaranteed conflict checking. So you can save a bunch of documents to
the database and be sure they are all safely stored, or none are
safely stored, but you can't be guaranteed you don't have any
conflicts when you do.
The second is to support non-ACID bulk transactions, where some
documents fail and some succeed. If the db crashes in the middle of
the transaction, some documents may have made it to disk (completely
intact), while others have not. The client will need to check to be
sure.
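The difference between the old model and the two proposed ones can
be sketched with a toy in-memory store. This is an illustration under
assumptions, not the patch: a dict maps doc id to (rev, body), an
update carries the rev it was based on, and every function name is
hypothetical. Crash behaviour is not modelled, only conflict handling.

```python
# Toy model only: db maps doc_id -> (rev, body).
# An update is doc_id -> (base_rev, new_body). All names are hypothetical.

def current_rev(db, doc_id):
    return db[doc_id][0] if doc_id in db else 0


def stale_ids(db, updates):
    """Ids whose update was based on a rev other than the stored one."""
    return [d for d, (base, _) in updates.items() if current_rev(db, d) != base]


def save_checked_atomic(db, updates):
    """Old model: all or nothing, and any conflict rejects the whole group."""
    stale = stale_ids(db, updates)
    if stale:
        return False, stale
    for d, (base, body) in updates.items():
        db[d] = (base + 1, body)
    return True, []


def save_unchecked_atomic(db, updates):
    """First proposed model: everything is stored, conflicts are not rejected.
    This toy just reports which ids were in conflict."""
    stale = stale_ids(db, updates)
    for d, (_, body) in updates.items():
        db[d] = (current_rev(db, d) + 1, body)
    return stale


def save_non_atomic(db, updates):
    """Second proposed model: each doc succeeds or fails on its own."""
    results = {}
    for d, (base, body) in updates.items():
        ok = current_rev(db, d) == base
        if ok:
            db[d] = (base + 1, body)
        results[d] = ok
    return results


def fresh_db():
    return {"a": (1, "old a"), "b": (2, "old b")}


# The update to "b" is based on rev 1, but the stored rev is 2.
updates = {"a": (1, "new a"), "b": (1, "new b")}

db1 = fresh_db()
print(save_checked_atomic(db1, updates))    # (False, ['b'])
db2 = fresh_db()
print(save_unchecked_atomic(db2, updates))  # ['b']
db3 = fresh_db()
print(save_non_atomic(db3, updates))        # {'a': True, 'b': False}
```

With the same stale update to "b", the old model stores nothing, the
first proposed model stores both documents and leaves a conflict
behind, and the second stores only "a" and leaves the client to check
the per-document results.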
With these 2 transaction models, it's possible to deploy the same
apps on a single machine or a huge partitioned cluster. With the
current model, it's only possible to deploy apps on a single
machine. I propose we drop the current model, as bulk transactions
are not supportable in clustered or replicated setups.
-Damien
Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
There is nothing more difficult to plan, more doubtful of success, nor
more dangerous to manage than the creation of a new order of things...
Whenever his enemies have the ability to attack the innovator, they do
so with the passion of partisans, while the others defend him
sluggishly, so that the innovator and his party alike are vulnerable.
-- Niccolo Machiavelli, 1513, The Prince.