[
https://issues.apache.org/jira/browse/CURATOR-233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713621#comment-14713621
]
J D commented on CURATOR-233:
-----------------------------
Hi Mike,
I can confirm that the code works for 2 nodes.
However, I think the two lines marked below should be moved into the else branch.
{code:title=DistributedDoubleBarrier.java|borderStyle=solid}
String watchPath; // Watch somebody else that still exists
if ( ourIndex == 0 )
{
    watchPath = ZKPaths.makePath(barrierPath, children.get(children.size() - 1));
}
else
{
    watchPath = ZKPaths.makePath(barrierPath, children.get(0));
    checkDeleteOurPath(ourNodeShouldExist); // here
    ourNodeShouldExist = false; // here
}
Stat stat = client.checkExists().usingWatcher(watcher).forPath(watchPath);
checkDeleteOurPath(ourNodeShouldExist); // not here
ourNodeShouldExist = false; // not here
{code}
As you guessed correctly, the fix changes the behavior for 3+ nodes. The reason
is that the exit barrier uses a shortcut that is not compatible with
client 0 leaving prematurely
(http://zookeeper.apache.org/doc/r3.1.2/recipes.html#sc_doubleBarriers).
Client 0 watches any other node (and leaves the barrier if it is the only one left).
All other clients watch client 0 (and leave the barrier once client 0 has left).
Thus, if any client other than client 0 leaves after maxWaitMs, nothing happens
and all remaining clients keep waiting.
But if client 0 leaves after maxWaitMs, all other nodes leave together with
client 0 (even if they do not have a maxWaitMs time limit).
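To illustrate the watch pattern described above, here is a minimal standalone sketch. It does not use Curator or ZooKeeper; the class ExitBarrierSketch and the methods watchTarget and notifiedWhenDeleted are hypothetical names made up for this illustration. It only models which client holds a watch on which node, assuming the sorted child list is fixed at the moment of the check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the exit-barrier shortcut; not Curator code.
class ExitBarrierSketch {
    // Node watched by the client at position ourIndex in the sorted child list:
    // client 0 watches the last child, every other client watches child 0.
    public static String watchTarget(List<String> children, int ourIndex) {
        if (ourIndex == 0) {
            return children.get(children.size() - 1);
        }
        return children.get(0);
    }

    // Clients whose watch fires when the node `deleted` goes away.
    // A fired watch only wakes the client up; it does not by itself
    // mean that the client leaves the barrier.
    public static List<String> notifiedWhenDeleted(List<String> children, String deleted) {
        List<String> notified = new ArrayList<>();
        for (int i = 0; i < children.size(); i++) {
            String self = children.get(i);
            if (!self.equals(deleted) && watchTarget(children, i).equals(deleted)) {
                notified.add(self);
            }
        }
        return notified;
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("a", "b", "c"); // "a" is client 0
        System.out.println(notifiedWhenDeleted(children, "a")); // [b, c]
        System.out.println(notifiedWhenDeleted(children, "b")); // []
        System.out.println(notifiedWhenDeleted(children, "c")); // [a]
    }
}
```

In this model, deleting client 0's node notifies every other client, while deleting a middle node notifies nobody, which matches the asymmetry for 3+ nodes.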
Best regards,
J D
> Bug in double barrier
> ---------------------
>
> Key: CURATOR-233
> URL: https://issues.apache.org/jira/browse/CURATOR-233
> Project: Apache Curator
> Issue Type: Bug
> Components: Recipes
> Affects Versions: 2.8.0
> Reporter: J D
> Assignee: Mike Drob
> Fix For: 2.9.0
>
> Attachments: DoubleBarrierClient.java, DoubleBarrierTester.java
>
>
> Hi,
> I think I discovered a bug in the internalLeave method of the double barrier
> implementation.
> When a client is told to leave the barrier after maxWait it does not do so. A
> flag is set but the client does not leave the barrier, instead it keeps
> iterating through the control loop and drives CPU usage to 100%.
> I have attached an example.
> Best regards
> Lianro