On Sep 24, 2008, at 9:53 , Ayende Rahien wrote:
On Wed, Sep 24, 2008 at 10:46 AM, Jan Lehnardt <[EMAIL PROTECTED]> wrote:
Anyway, I had a few questions that I hope I'll be able to get some
answers for.
merge conflicts - how does CouchDB decide on the "best" revision?
It arbitrarily chooses one revision. The only guarantee that is made is
that, for the same conflict, all nodes in a CouchDB cluster choose the same
latest revision to ensure data consistency.
How do you ensure that across a cluster, all nodes will select the same
version?
Assume that I have the following sequence of events:
- create doc A (v1)
- update doc A from v1 (v2)
- update doc A from v1 (v3) - conflict
- update doc A from v1 on separate machine (v?) - conflict
How does it get resolved?
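There are two types of conflicts here: update conflicts and
replication conflicts.
You cannot update doc A from V1 to V3 on a single server: once V2 exists,
an update based on V1 is rejected as an update conflict. A replication
conflict looks like this: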
- server 1: create doc A(V1)
- replicate server 1 and server 2
(doc A now lives on server 1 and server 2 with V1)
- server 1: update doc A(V1) to doc A(V2a)
- server 2: update doc A(V1) to doc A(V2b)
(now there are two V2s for doc A. No problem so far.)
- replicate server 1 and server 2:
- CouchDB sees that V2a and V2b are different and picks
either one to be the latest revision. Say V2a gets chosen.
- Server 1 and server 2 now both have doc A (V2a) as the
latest revision, but doc A is flagged with a _conflict attribute.
- You need to go in and resolve that by either approving CouchDB's
automatic choice or by using a previous revision. There is no merging
and there is no auto-conflict-resolution. Only auto-conflict-detection.
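
To illustrate how every node can arrive at the same "arbitrary" choice
without talking to the others, here is a minimal sketch in Python. The
ordering rule (highest generation, then highest revision id) and the names
Rev, pick_winner and conflicting are assumptions made for this example;
they are not taken from the CouchDB source.

```python
from typing import Iterable, List, NamedTuple


class Rev(NamedTuple):
    generation: int  # how many edits deep this revision is (hypothetical field)
    rev_id: str      # opaque revision identifier (hypothetical field)


def pick_winner(leaves: Iterable[Rev]) -> Rev:
    """Pick one revision using a rule that ignores arrival order.

    max() over a total order returns the same element no matter how
    the input is ordered, so every node computes the same winner.
    """
    return max(leaves, key=lambda r: (r.generation, r.rev_id))


def conflicting(leaves: Iterable[Rev]) -> List[Rev]:
    """Return the revisions that lost; they are kept and flagged, not merged."""
    leaves = list(leaves)
    winner = pick_winner(leaves)
    return sorted(r for r in leaves if r != winner)


v2a = Rev(2, "V2a")
v2b = Rev(2, "V2b")

# After replication both servers hold both revisions, but they may have
# received them in a different order.
server1_view = [v2a, v2b]
server2_view = [v2b, v2a]

winner1 = pick_winner(server1_view)
winner2 = pick_winner(server2_view)
losers = conflicting(server1_view)
```

Both servers end up with the same winner. This particular rule happens to
pick V2b rather than V2a; which one wins does not matter, only that the
choice is the same everywhere and that the losing revision stays available
for manual resolution.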
Two questions that are of particular interest to me, and that I haven't
been able to get from the code so far, are:
- How is the data stored? I think that it is a binary tree on disk, but I
am not following how updates to that can be done safely with ACID
guarantees.
Writes are serialized. Only one write can happen at a time and it is
completely flushed and committed to disk (2 x fsync()) before another write
comes in. Writes are append-only. No data is ever overwritten. This gives us
the ACID & MVCC buzzcronyms :-)
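
A rough sketch of what "serialized, append-only, synced twice" can look like
in Python. This is not CouchDB's file format. The record layout, the "HDR"
marker and the function names are invented for the example. Using the
second fsync() for a small commit marker written after the data is my
assumption about what the two syncs are for.

```python
import os
import struct
import tempfile
import threading

_write_lock = threading.Lock()  # serializes writers: one write at a time


def append_doc(path: str, payload: bytes) -> int:
    """Append a record plus a commit marker; return the record's offset."""
    with _write_lock:
        with open(path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            # 1) Append the data itself; earlier bytes are never touched.
            f.write(struct.pack(">I", len(payload)) + payload)
            f.flush()
            os.fsync(f.fileno())
            # 2) Append a marker pointing at the new data, then sync again.
            #    A crash before this point leaves the old state intact.
            f.write(b"HDR" + struct.pack(">Q", offset))
            f.flush()
            os.fsync(f.fileno())
    return offset


def read_doc(path: str, offset: int) -> bytes:
    """Read back the length-prefixed record stored at offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        (length,) = struct.unpack(">I", f.read(4))
        return f.read(length)


path = os.path.join(tempfile.mkdtemp(), "demo.db")
first = append_doc(path, b'{"_id": "A", "v": 1}')
second = append_doc(path, b'{"_id": "A", "v": 2}')
```

Because the second write only adds bytes at the end of the file, a reader
that still holds the offset of the first record can keep reading the old
version while the new one is being written, which is the MVCC part.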
Can you speak more on the actual file format? I don't think that I
understand how you can have append only with binary trees.
I have to refer you to Damien or the source for that one. :-)
Cheers
Jan
--