Filip Niksic created ZOOKEEPER-3875:
---------------------------------------
Summary: Sequential consistency violation
Key: ZOOKEEPER-3875
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3875
Project: ZooKeeper
Issue Type: Bug
Components: quorum
Affects Versions: 3.5.8
Reporter: Filip Niksic
Attachments: zookeeper-sc-violation.patch
Using a [tool|https://github.com/fniksic/zootester] that I wrote for testing
ZooKeeper, I discovered the following scenario that causes ZooKeeper to
violate sequential consistency.
Initially, start an ensemble with 3 servers called A, B, and C, and initialize
2 znodes called /key0 and /key1 to 0. Stop all servers.
# Start A and B. Stop A and at the same time initiate setting /key1 to 101 on
B. Stop B.
# Start A and B and stop them. In this step it seems that /key1 == 101 is
successfully propagated to A.
# Start A and C. Initiate a conditional write on A: if /key1 == 101, set /key0
to 200. The write seems to be successful. Stop the servers.
# Start A, B, and C. Initiate a conditional write on B: if /key1 == 0, set
/key1 to 301. Surprisingly, the write succeeds. Stop the servers.
Finally, start all servers and read the values of /key0 and /key1 on all
servers. They will be 200 and 301.
Even if we assume that any write can fail, the set of possible values for /key0
and /key1 under sequential consistency consists of (0, 0), (0, 101), (200,
101), and (0, 301). The values (200, 301) should not be possible: if /key0 ==
200, then setting /key1 to 101 must have succeeded. On the other hand, if /key1
== 301, then setting /key1 to 101 must have failed, because that write was
issued before the conditional write in step 4 read /key1 == 0.
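The argument above can be checked mechanically. The following is a minimal sketch, not part of the attached patch or of zootester. It assumes that the three writes take effect in the order in which they were issued and that any of them may fail independently. The names run, allowed, and observed are made up for illustration.

```python
from itertools import product

# Initial values of the two znodes, as set up before the scenario starts.
INITIAL = {"/key0": 0, "/key1": 0}

def run(w1_ok, w2_ok, w3_ok):
    """Apply the three writes in issue order.

    Each flag says whether the corresponding write is allowed to take
    effect; False models a write that failed. This ordering is an
    assumption of the sketch, not something ZooKeeper reports.
    """
    state = dict(INITIAL)
    # Step 1: unconditional write, set /key1 to 101.
    if w1_ok:
        state["/key1"] = 101
    # Step 3: conditional write, if /key1 == 101 then set /key0 to 200.
    if w2_ok and state["/key1"] == 101:
        state["/key0"] = 200
    # Step 4: conditional write, if /key1 == 0 then set /key1 to 301.
    if w3_ok and state["/key1"] == 0:
        state["/key1"] = 301
    return (state["/key0"], state["/key1"])

# Every combination of succeeding and failing writes.
allowed = {run(*flags) for flags in product([False, True], repeat=3)}

# Final values read from all servers in the scenario: (/key0, /key1).
observed = (200, 301)

print(sorted(allowed))
print(observed in allowed)
```

Under these assumptions the enumeration yields exactly the four pairs listed above, and the observed pair (200, 301) is not among them.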
The cause of this bug is probably related to the cause of ZOOKEEPER-2832, which
was reported 3 years ago and is still open. You will notice that the above
scenario is similar to the scenario reported there. Indeed, my tool randomly
explores similar scenarios with conditional and unconditional writes under
random server crashes, in search of sequential consistency violations.
I have attached a patch with a test that reproduces this bug. The affected
version is 3.5.8. I suspect that 3.6.1 is also affected, but unfortunately, I'm
having trouble compiling that version.