Dmitri Perelman created ZOOKEEPER-1486:
------------------------------------------
Summary: A couple of bugs in the tutorial code
Key: ZOOKEEPER-1486
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1486
Project: ZooKeeper
Issue Type: Bug
Components: documentation
Reporter: Dmitri Perelman
Priority: Minor
Hi,
There are two problems with the barrier example code in the tutorial:
1) A znode created by a process in the function enter() is created with
SEQUENTIAL suffix, however, the name of a znode deleted in the function leave()
doesn't have this suffix. Actually, the leave() function tries to delete a
nonexistent node => a KeeperException is thrown, which is caught silently =>
the process terminates without waiting for the barrier.
2) It seems that the very idea of leaving the barrier by deleting ephemeral
nodes is problematic. Consider the following scenario: there are two clients:
C1 and C2.
- C1 enters the barrier, creates a znode /b1/C1, checks that it's alone and
starts waiting for the second client to come.
- C2 enters the barrier and creates a znode /b1/C2 - the notification to C1 is
sent but still not delivered.
- C2 observes that there are enough children to /b1, enters the barrier,
executes its own operations and invokes leave() procedure.
- during the leave() procedure C2 removes its znode /b1/C2 and exits.
- when the notification about C2's arrival finally arrives to C1, C1 checks
the children of /b1 and doesn't find C2's znode: C1 is stuck.
The solution to this data race would be to create special znodes for leaving
the barrier, similarly to the way they are created for entering the barrier.
Thanks,
Dima
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira