[ 
https://issues.apache.org/jira/browse/CURATOR-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330182#comment-17330182
 ] 

Jordan Zimmerman commented on CURATOR-595:
------------------------------------------

I think the {{acquire(0, ...)}} is the problem. If I get rid of the sleep 
before acquire and change the acquire to: 
{{assertNotNull(semaphore.acquire(sessionTimout * 2, TimeUnit.SECONDS));}} it 
works every time

> InterProcessSemaphoreV2 LOST isn't releasing permits for other clients
> ----------------------------------------------------------------------
>
>                 Key: CURATOR-595
>                 URL: https://issues.apache.org/jira/browse/CURATOR-595
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Recipes
>    Affects Versions: 5.1.0
>            Reporter: Francesco Nigro
>            Assignee: Jordan Zimmerman
>            Priority: Major
>
> I'm not sure this is the right place to raise this, but I've added this test 
> on TestInterProcessSemaphore:
> {code:java}
>     @Test
>     public void testAcquireAfterLostServerOnRestart() throws Exception {
>         final int sessionTimout = 4000;
>         final int connectionTimout = 2000;
>         try (CuratorFramework client = 
> CuratorFrameworkFactory.newClient(server.getConnectString(), sessionTimout, 
> connectionTimout, new RetryNTimes(0, 1))) {
>             client.start();
>             client.blockUntilConnected();
>             final InterProcessSemaphoreV2 semaphore = new 
> InterProcessSemaphoreV2(client, "/1", 1);
>             assertNotNull(semaphore.acquire());
>             CountDownLatch lost = new CountDownLatch(1);
>             client.getConnectionStateListenable().addListener((client1, 
> newState) -> {
>                 if (newState == ConnectionState.LOST) {
>                     lost.countDown();
>                 }
>             });
>             server.stop();
>             lost.await();
>         }
>         server.restart();
>         try (CuratorFramework client = 
> CuratorFrameworkFactory.newClient(server.getConnectString(), sessionTimout, 
> connectionTimout, new RetryNTimes(0, 1))) {
>             client.start();
>             client.blockUntilConnected();
>             final InterProcessSemaphoreV2 semaphore = new 
> InterProcessSemaphoreV2(client, "/1", 1);
>             final int serverTick = ZooKeeperServer.DEFAULT_TICK_TIME;
>             Thread.sleep(sessionTimout + serverTick);
>             assertNotNull(semaphore.acquire(0, TimeUnit.SECONDS));
>         }
>     }
> {code}
> And this is not passing: the doc of InterProcessSemaphoreV2 state that 
> bq. "However, if the client session drops (crash, etc.), any leases held by 
> the client are automatically closed and made available to other clients." 
> maybe I'm missing something obvious on the ZK server config instead.
> Just checked out that by running on separated processes the same test:
> # start server on process A
> # start lease acquire on process B, listening for LOST events before suicide
> # restart server on Process A cause process B to suicide (as expected)
> # start lease acquire on process C, now succeed
> It seems that there is something going on in the intra-process case that's 
> not working as expected (to me, at least).
> NOTE: as written in newer comments, raising the timeout doesn't seems to work 
> too and different boxes are getting different outcomes (making this an 
> intermittent failure).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to