Filip Niksic created ZOOKEEPER-3875:
---------------------------------------

             Summary: Sequential consistency violation
                 Key: ZOOKEEPER-3875
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3875
             Project: ZooKeeper
          Issue Type: Bug
          Components: quorum
    Affects Versions: 3.5.8
            Reporter: Filip Niksic
         Attachments: zookeeper-sc-violation.patch

Using a [tool|https://github.com/fniksic/zootester] that I wrote for testing 
ZooKeeper, I discovered the following scenario which causes ZooKeeper to 
violate sequential consistency.

Initially, start an ensemble with 3 servers called A, B, and C, and initialize 
2 znodes called /key0 and /key1 to 0. Stop all servers.
 # Start A and B. Stop A and, at the same time, initiate a write on B that sets 
/key1 to 101. Stop B.
 # Start A and B, then stop them. In this step, /key1 == 101 seems to be 
successfully propagated to A.
 # Start A and C. Initiate a conditional write on A: if /key1 == 101, set /key0 
to 200. The write seems to be successful. Stop the servers.
 # Start A, B, and C. Initiate a conditional write on B: if /key1 == 0, set 
/key1 to 301. Surprisingly, the write succeeds. Stop the servers.

Finally, start all servers and read the values of /key0 and /key1 on all 
servers. They will be 200 and 301, respectively.

Even if we assume that any write can fail, the set of possible values for 
(/key0, /key1) under sequential consistency consists of (0, 0), (0, 101), 
(200, 101), and (0, 301). The values (200, 301) should not be possible: if 
/key0 == 200, then setting /key1 to 101 must have succeeded. On the other hand, 
if /key1 == 301, then setting /key1 to 101 must have failed, because that write 
happens before the conditional write that reads /key1 == 0.
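
The argument above can be checked mechanically. The following is a minimal 
sketch, not part of the attached patch or of zootester: it models a single 
sequentially consistent copy of the data, applies the three writes in the order 
they are issued in the scenario, and lets each write independently fail. The 
names run and allowed_outcomes are hypothetical, and the assumption that each 
write completes or fails before the next one starts is taken from the scenario 
description.

```python
from itertools import product


def run(w1_ok, w2_ok, w3_ok):
    """Apply the three writes, in order, to one sequentially consistent copy.

    Each flag says whether that write is allowed to take effect; False
    models a write that fails (an assumption of this sketch).
    Returns the final (key0, key1).
    """
    key0, key1 = 0, 0
    # Step 1: unconditional write, /key1 = 101.
    if w1_ok:
        key1 = 101
    # Step 3: conditional write, if /key1 == 101 then /key0 = 200.
    if w2_ok and key1 == 101:
        key0 = 200
    # Step 4: conditional write, if /key1 == 0 then /key1 = 301.
    if w3_ok and key1 == 0:
        key1 = 301
    return key0, key1


def allowed_outcomes():
    """Enumerate every success/failure combination of the three writes."""
    return {run(*flags) for flags in product([False, True], repeat=3)}


if __name__ == "__main__":
    outcomes = allowed_outcomes()
    print(sorted(outcomes))        # [(0, 0), (0, 101), (0, 301), (200, 101)]
    print((200, 301) in outcomes)  # False
```

Under these assumptions the enumeration yields exactly the four pairs listed 
above, and the observed pair (200, 301) is not among them.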

The cause of this bug is probably related to the cause of ZOOKEEPER-2832, which 
was reported 3 years ago and is still open. You will notice that the above 
scenario is similar to the scenario reported there. Indeed, my tool randomly 
explores similar scenarios with conditional and unconditional writes under 
random server crashes, in search of sequential consistency violations.

I have attached a patch with a test that reproduces this bug. The affected 
version is 3.5.8. I suspect that 3.6.1 is also affected, but unfortunately, I'm 
having trouble compiling that version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
