On Wed, Sep 24, 2008 at 11:04 AM, Jan Lehnardt <[EMAIL PROTECTED]> wrote:
>> How do you ensure that across a cluster, all nodes will select the same
>> version?
>> Assume that I have the following sequence of events:
>> - create doc A (v1)
>> - update doc A from v1 (v2)
>> - update doc A from v1 (v3) - conflict
>> - update doc A from v1 on separate machine (v?) - conflict
>> How does it get resolved?
>
> There are two types of conflicts here: update conflicts and replication
> conflicts.
>
> You cannot update doc A from V1 to V3.

Can I try to update a document that was already updated?
Let us say that I get v1, update it, and try to save, while at the same
time someone else saved. What is going to happen? An optimistic concurrency
error? Or does it save and produce a conflict?

> - server 1: create doc A(V1)
> - replicate server 1 and server 2
>   (doc A now lives on server 1 and server 2 with V1)
> - server 1: update doc A(V1) to doc A(V2a)
> - server 2: update doc A(V1) to doc A(V2b)
>   (now there are two V2 for doc A. No problem so far)
> - replicate server 1 and server 2:
>   - CouchDB sees that V2a and V2b are different and decides
>     either one to be the latest revision. Say V2a gets chosen.
>   - Server 1 and server 2 now both have doc A (V2a) as the
>     latest revision, but doc A is flagged with a _conflict attribute.
>   - You need to go in and resolve that by either approving CouchDB's
>     automatic choice or by using a previous revision. There is no merging
>     and there is no auto-conflict-resolution. Only auto-conflict-detection.

Okay, I see how this works for 2 servers. What happens if we have three?
So now we have V2a, V2b, V2c.
Server 1 replicates with server 2 (V2a is chosen)
Server 1 replicates with server 3 (V2? is chosen)
What is going on with server 2? On the next replication, will it get
whatever was chosen by 1 & 3?

>>>> Two questions that are of particular interest to me, and I haven't been
>>>> able to get from the code so far are:
>>>>
>>>> - How is the data stored? I think that it is a binary tree on disk, but
>>>>   I am not following how updates to that can be safe to do so with ACID
>>>>   guarantees.
>>>
>>> Writes are serialized. Only one write can happen at a time and it is
>>> completely flushed and committed to disk (2 x fsync()) before another
>>> write comes in. Writes are append-only. No data is ever overwritten.
>>> This gives us the ACID & MVCC buzzcronyms :-)
>>
>> Can you speak more on the actual file format? I don't think that I
>> understand how you can have append only with binary trees.
>
> I have to refer you to Damien or the source for that one. :-)

Trolling the sources now, but it is pretty hard to figure it out.
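
To make the append-only question concrete for myself, here is a minimal sketch of how a tree can be updated without overwriting anything. It is a generic copy-on-write (path copying) technique, not CouchDB's actual file format, which I still have to get from Damien or the source. The names `log`, `insert` and `lookup` are hypothetical, and a Python list stands in for the file: an update appends fresh copies of the nodes on the path from the root to the changed key, and the last node appended is the new root.

```python
from typing import List, Optional, Tuple

# One appended record: (key, value, left_offset, right_offset).
# Offsets are indexes into `log`, standing in for byte offsets in a file.
Node = Tuple[str, str, Optional[int], Optional[int]]


def insert(log: List[Node], root: Optional[int], key: str, value: str) -> int:
    """Append copies of the nodes on the path to `key`; return the new root offset."""
    if root is None:
        log.append((key, value, None, None))
        return len(log) - 1
    k, v, left, right = log[root]
    if key < k:
        left = insert(log, left, key, value)
    elif key > k:
        right = insert(log, right, key, value)
    else:
        v = value  # same key: the copy carries the new value
    # The old record at `root` is left untouched; a modified copy is appended.
    log.append((k, v, left, right))
    return len(log) - 1


def lookup(log: List[Node], root: Optional[int], key: str) -> Optional[str]:
    """Read `key` as seen from the tree rooted at `root`."""
    while root is not None:
        k, v, left, right = log[root]
        if key == k:
            return v
        root = left if key < k else right
    return None


log: List[Node] = []
root1 = insert(log, None, "b", "v1")
root2 = insert(log, root1, "a", "x")
root3 = insert(log, root2, "b", "v2")
```

Every old root stays valid, so a reader holding `root1` still sees the data as it was while a writer appends `root3`. That would explain the MVCC part. In a real file, I assume the writer would flush the new nodes first and only then record the new root offset, so a crash in between leaves the previous root intact.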

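On the three-server question, my guess is that it only works if every node picks the winner with the same deterministic rule, so the order of replication does not matter. Below is a sketch under that assumption. The rule used here (longest revision history wins, ties broken by comparing revision ids) is my assumption, not something stated in this thread, and `Rev` and `pick_winner` are hypothetical names.

```python
from itertools import permutations
from typing import Iterable, NamedTuple


class Rev(NamedTuple):
    depth: int   # number of edits in this revision's history
    rev_id: str  # opaque revision identifier


def pick_winner(revs: Iterable[Rev]) -> Rev:
    # Assumed rule: deeper history wins; equal depth falls back to id order.
    # The result depends only on the set of revisions, not on arrival order.
    return max(revs, key=lambda r: (r.depth, r.rev_id))


v2a, v2b, v2c = Rev(2, "v2a"), Rev(2, "v2b"), Rev(2, "v2c")

# Whatever order the three revisions arrive in, each server ends up with
# the same set and therefore computes the same winner.
winners = {pick_winner(order) for order in permutations([v2a, v2b, v2c])}
```

With a rule like this, server 2 may temporarily disagree with server 3 while it only knows about V2a and V2b, but once all three revisions have reached every server they all settle on the same one. The losing revisions would still be kept and flagged as conflicts.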