On Tue, Apr 7, 2009 at 5:51 AM, Jan Lehnardt <[email protected]> wrote:
> Hi Scott,
>
> thanks for raising your concerns. I share them, but I think Brian and
> Adam are only suggesting an optional addition to the existing
> API which leaves the existing case in place. Much like bulk docs
> now has two modes, PUT can have two modes.
>
> Something like
>
> PUT /db/doc?rev=foo&allow_conflicts=true
> {"json":"body"}
If I haven't missed the point, I would favor the _rev being in the body of
the JSON document:

PUT /db/doc?allow_conflicts=true
{"_rev":"somerev","json":"body"}

This would allow one to restore a dump without having to process each JSON
document (removing the _rev attribute) before submitting.
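
A rough sketch of what that restore could look like from a client. Note
that allow_conflicts is only the parameter being proposed in this thread,
not an existing option, and build_restore_put, the host, and the database
name are made up for illustration. The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request


def build_restore_put(base_url, db, doc):
    """Build (but do not send) a PUT that keeps _rev inside the JSON body."""
    doc_id = urllib.parse.quote(doc["_id"], safe="")
    # allow_conflicts is the hypothetical parameter proposed in this thread.
    url = f"{base_url}/{db}/{doc_id}?allow_conflicts=true"
    # The dumped document is sent untouched, _rev included.
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# A document exactly as it might appear in a dump (values are placeholders).
dumped = {"_id": "doc", "_rev": "somerev", "json": "body"}
req = build_restore_put("http://localhost:5984", "db", dumped)
print(req.get_method(), req.full_url)
```

The point is only that the dump can be replayed as-is: nothing has to strip
_rev out of each document and move it into the query string.
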
Regards,
Jeff Hinrichs
> I wouldn't be opposed to adding this.
>
> Cheers
> Jan
> --
>
>
> On 6 Apr 2009, at 20:40, Scott Shumaker wrote:
>
>> Just my $0.02, but I think CouchDB is moving in entirely the wrong
>> direction with conflicts in a misguided attempt to make multi-master
>> replication the 'only' way to do things.
>>
>> Very frequently, you need to attempt to resolve a conflict as soon as
>> it occurs - and you often need user interaction to help you resolve
>> the conflict. Sometimes you may need to just refresh the user to the
>> latest version, other times you may be able to choose one of the
>> versions based on some criteria, sometimes you can automatically merge
>> the two versions, and occasionally you need to ask the user what to
>> do. This just won't work if the process is happening offline, in a
>> background job.
>>
>> This isn't just true of CouchDB, but of other distributed systems like
>> Dynamo (read the paper, they talk about this exact issue. Amazon.com
>> has a "merge shopping carts" screen for this exact reason).
>>
>> Getting rid of conflict handling greatly limits the utility of CouchDB
>> for real-world applications (it will certainly force us to adopt
>> another technology instead). And worse, this is all for the goal of
>> supporting multi-master replication, which really isn't a great
>> technology solution anyway. If you want durability and scalability,
>> CouchDB should really adopt the much more robust multiple write nodes
>> / read nodes system (with quorum and reconciliation) in Dynamo or a
>> few other distributed key/value stores.
>>
>> Scott
>>
>>
>> On Mon, Apr 6, 2009 at 12:40 AM, Brian Candler <[email protected]> wrote:
>>
>>> The following is part thought-experiment, part serious suggestion.
>>>
>>> I propose the following: remove all concurrency control from PUT
>>> operations, and hence also the 409 response. If you PUT a document where
>>> the _rev is not the same as a 'head' revision, then a new conflicting
>>> version is inserted. [1]
>>>
>>> The reasoning is as follows:
>>>
>>> 1. Any application which relies on the 409 PUT conflict behaviour is
>>> not going to work properly in a multi-master replication environment.
>>> That is: it is protected against concurrent changes on the same node,
>>> but not on a different node. This is arbitrary.
>>>
>>> 2. The same reasoning was used for getting rid of bulk non-conflicting
>>> updates. Paraphrasing: "a grown-up CouchDB app which runs on a replicated
>>> cluster won't be able to rely on these semantics, so removing this
>>> capability will encourage you to write your app in a more scalable way.
>>> You will thank us later."
>>>
>>> 3. A CouchDB app should be written so that it "treats edit conflicts as a
>>> common state, not an exceptional one" [2]
>>>
>>> This change will slightly increase the number of these normal conflicts,
>>> whilst forcing the app writer to deal with them.
>>>
>>> 4. By increasing the number of conflicting versions, it is likely to
>>> exercise the underlying code more and flush out bugs (for example, more
>>> fully testing what happens in views when multiple conflicting versions
>>> of a document are updated or removed).
>>>
>>> 5. It may highlight more clearly where API improvements are needed to
>>> help applications deal with and resolve conflicts. For example:
>>>
>>> - making it easier for applications to be aware of the existence of
>>> conflicts (Maybe a GET without _rev should fail if there are multiple
>>> conflicting revs, or return all of the versions)
>>>
>>> - given that multiple concurrent clients will see conflicts, and may
>>> attempt to resolve them at the same time, then it's likely that two
>>> clients will independently submit exactly the same document content
>>> after running the conflict-resolution algorithm. It could be helpful
>>> if these were treated as a single new rev, and not two new conflicts.
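
One way to picture the second point: if the new revision identifier were
derived from the parent revision plus the document content, two clients
submitting the same resolution would produce the same rev. The sketch below
is purely illustrative. content_rev is a made-up helper and this is not a
claim about how CouchDB assigns revisions.

```python
import hashlib
import json


def content_rev(parent_rev, doc):
    """Derive a revision id from the parent rev and the document content."""
    # Ignore the _rev field itself so only the real content is hashed.
    body = {k: v for k, v in doc.items() if k != "_rev"}
    # Canonical JSON: sorted keys, fixed separators, so key order is irrelevant.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((parent_rev + canonical).encode("utf-8")).hexdigest()


# Two clients resolve the same conflict and arrive at the same content.
client_a = {"_id": "doc", "_rev": "1-abc", "title": "merged", "tags": ["x"]}
client_b = {"tags": ["x"], "title": "merged", "_id": "doc", "_rev": "1-abc"}

rev_a = content_rev("1-abc", client_a)
rev_b = content_rev("1-abc", client_b)

# Storing revisions keyed by rev id collapses the two submissions into one.
revisions = {rev_a: client_a, rev_b: client_b}
print(len(revisions))
```

With random rev ids the two identical submissions would show up as two new
conflicting revisions instead of one.
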
>>>
>>> Comments? I would be especially interested in hearing from core
>>> developers who didn't want bulk non-conflicting updates, but *do* want to
>>> retain single non-conflicting updates, as to why this is logical.
>>>
>>> Regards,
>>>
>>> Brian.
>>>
>>> [1] You can get this behaviour on 0.9.0 by POSTing to _bulk_docs with
>>> {"all_or_nothing":true}
>>>
>>> [2] http://couchdb.apache.org/docs/overview.html under heading "Conflicts"
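
For reference, footnote [1] can be sketched roughly as below. This assumes
the _bulk_docs body is an object with a "docs" array next to the
"all_or_nothing" flag, which is my understanding of the 0.9.0 format. The
helper name, host, and database are made up, and the request is built but
not sent.

```python
import json
import urllib.request


def build_bulk_docs_request(base_url, db, docs):
    """Build (but do not send) a _bulk_docs POST in all_or_nothing mode."""
    # Assumed body shape: the flag sits alongside the list of documents.
    payload = {"all_or_nothing": True, "docs": docs}
    return urllib.request.Request(
        f"{base_url}/{db}/_bulk_docs",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# A single document whose _rev may not match the current head revision.
stale = {"_id": "doc", "_rev": "somerev", "json": "body"}
bulk_req = build_bulk_docs_request("http://localhost:5984", "db", [stale])
print(bulk_req.get_method(), bulk_req.full_url)
```

Per the footnote, a stale _rev submitted this way is stored as a new
conflicting version rather than rejected with a 409.
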
>>>
>>>
>>
>