XComp commented on code in PR #430:
URL: https://github.com/apache/curator/pull/430#discussion_r969519573


##########
curator-recipes/src/test/java/org/apache/curator/framework/recipes/leader/TestLeaderLatch.java:
##########
@@ -218,6 +218,56 @@ public void testWatchedNodeDeletedOnReconnect() throws 
Exception
         }
     }
 
+    @Test
+    public void 
testLeadershipElectionWhenNodeDisappearsAfterChildrenAreRetrieved() throws 
Exception
+    {
+        final String latchPath = "/foo/bar";
+        final Timing2 timing = new Timing2();
+        try (CuratorFramework client = 
CuratorFrameworkFactory.newClient(server.getConnectString(), timing.session(), 
timing.connection(), new RetryOneTime(1)))
+        {
+            client.start();
+            LeaderLatch latchInitialLeader = new LeaderLatch(client, 
latchPath, "initial-leader");
+            LeaderLatch latchCandidate0 = new LeaderLatch(client, latchPath, 
"candidate-0");
+            LeaderLatch latchCandidate1 = new LeaderLatch(client, latchPath, 
"candidate-1");
+
+            try
+            {
+                latchInitialLeader.start();
+
+                // we want to make sure that the leader gets leadership before 
other instances joining the party
+                waitForALeader(Collections.singletonList(latchInitialLeader), 
new Timing());
+
+                // candidate #0 will wait for the leader to go away - this 
should happen after the child nodes are retrieved by candidate #0
+                latchCandidate0.debugCheckLeaderShipLatch = new 
CountDownLatch(1);
+
+                latchCandidate0.start();
+                timing.sleepABit();

Review Comment:
   tbh, I'm not really happy with the sleep here and in [line 
248](https://github.com/apache/curator/pull/430/files#diff-75966280cab1f9788b771d244e889731ba35c7918d365c070565e070d5b801ebR248)
 because they are a cause for instabilities: The `close` in [line 
251](https://github.com/apache/curator/pull/430/files#diff-75966280cab1f9788b771d244e889731ba35c7918d365c070565e070d5b801ebR251)
 has to happen after the child nodes for `candidate #0` and `candidate #1` are 
created. AFAIU, we cannot ensure that with the sleep calls due to the 
asynchronous nature of the `start` command that is triggered right before each 
sleep.
   I tried to add a `waitForCondition` instead, when coming up with this test, 
initially, that would wait for corresponding child to be created. 
Unfortunately, this resulted in the test blocking forever because (I guess) the 
await on the `latchCandidate0.debugCheckLeaderShipLatch` is executed in the 
main thread which makes any subsequent operation (including the check for 
children nodes) being blocked.
   
   I hoped that somebody else could come up with a better approach here. 
:thinking: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@curator.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to