Hi all,
I took a thread dump to check where the thread is stuck when
CuratorFrameworkImpl.close() -> EnsembleTracker.close() -> watcher removal
is invoked in this specific case (ZK server down). Basically, close() is
blocked waiting for the watcher removal to complete in the foreground, BUT
this can NOT happen because the ZK server is down.
In 5.7.1 this was much faster, so I assume the removal was done in the
background or in a different manner.
I have seen that there have been some relevant changes in the WatcherRemoval
classes. Could you help me debug this problem?

"main" #1 prio=5 os_prio=0 cpu=703,45ms elapsed=11,04s tid=0x00007f1144019dd0 nid=0x229807 waiting on condition  [0x00007f1149dfb000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@17.0.14/Native Method)
        - parking to wait for  <0x00000005b4258b38> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.parkNanos(java.base@17.0.14/LockSupport.java:252)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@17.0.14/AbstractQueuedSynchronizer.java:717)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(java.base@17.0.14/AbstractQueuedSynchronizer.java:1074)
        at java.util.concurrent.CountDownLatch.await(java.base@17.0.14/CountDownLatch.java:276)
        at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:417)
        at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:80)
        at org.apache.curator.framework.imps.RemoveWatchesBuilderImpl.pathInForeground(RemoveWatchesBuilderImpl.java:232)
        at org.apache.curator.framework.imps.RemoveWatchesBuilderImpl.internalRemoval(RemoveWatchesBuilderImpl.java:86)
        at org.apache.curator.framework.imps.WatcherRemovalManager.removeWatchers(WatcherRemovalManager.java:59)
        at org.apache.curator.framework.imps.WatcherRemovalFacade.removeWatchers(WatcherRemovalFacade.java:54)
        at org.apache.curator.framework.imps.EnsembleTracker.close(EnsembleTracker.java:101)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.close(CuratorFrameworkImpl.java:382)
        at com.cheva.grantor.CuratorCloseSlow.tesCuratorCloseSlow(CuratorCloseSlow.java:28)
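
To illustrate what I think the trace shows (a timed CountDownLatch.await() that can never be released while the server is down), here is a minimal sketch using only the JDK. It is NOT Curator code: the class LatchWaitDemo, its methods and the 500 ms timeout are made up for illustration, and the latch is only a stand-in for a "connected" signal.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchWaitDemo {

    // Hypothetical stand-in for a blocking "wait until connected or timed out" step.
    // Returns true if the latch was released, false if the timeout elapsed first.
    static boolean awaitConnected(CountDownLatch connected, long timeoutMillis)
            throws InterruptedException {
        return connected.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Measures how long the calling thread stays parked in awaitConnected().
    static long timedWaitMillis(CountDownLatch connected, long timeoutMillis)
            throws InterruptedException {
        long start = System.nanoTime();
        awaitConnected(connected, timeoutMillis);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws InterruptedException {
        // "Server down": nobody ever counts the latch down, so the caller
        // is parked (TIMED_WAITING) until the full timeout has elapsed.
        CountDownLatch neverConnected = new CountDownLatch(1);
        System.out.println("server down: waited about "
                + timedWaitMillis(neverConnected, 500) + " ms");

        // "Server up": the latch is already at zero, so await() returns at once.
        CountDownLatch alreadyConnected = new CountDownLatch(0);
        System.out.println("server up: waited about "
                + timedWaitMillis(alreadyConnected, 500) + " ms");
    }
}
```

If a foreground operation performs a wait like this for each attempt, and possibly again on retry, the delays add up, which would be consistent with close() taking much longer when the server is stopped. That is only my reading of the trace, not something I have confirmed in the Curator sources.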

Regards,

Evaristo


    On Sunday, 9 March 2025, 08:31:44 CET, Evaristo José Camarero 
<evaristojo...@yahoo.es> wrote:  
 
 

  Hi all,
I took the recent 5.8.0 release and some project tests were running really
slowly compared with 5.7.1.
I took a closer look and the CuratorFramework.close() method is really slow
when the ZK server is stopped.
I have included a test that makes reproduction easy.
I am running Manjaro with OpenJDK 17.
When the test runs with Curator 5.7.1, closing the Curator instance takes
1200 millis.
When the test runs with Curator 5.8.0, closing the Curator instance takes
20000 millis.
It looks to me like there is something wrong here, BUT I wanted to double
check with you.
Best regards,
Cheva

package com.cheva.grantor;

import static java.util.concurrent.TimeUnit.SECONDS;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.time.Duration;
import java.time.Instant;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryOneTime;
import org.apache.curator.test.BaseClassForTests;
import org.junit.jupiter.api.Test;

class CuratorCloseSlow extends BaseClassForTests {

  @Test
  void tesCuratorCloseSlow() throws Exception {
    Instant t0;
    try (CuratorFramework cf =
        CuratorFrameworkFactory.newClient(server.getConnectString(), new RetryOneTime(1_000))) {
      cf.start();
      assertTrue(cf.blockUntilConnected(2, SECONDS));
      cf.create().forPath("/jejeje");
      // Stop the ZK server so that close() runs against a dead server.
      server.stop();
      Thread.sleep(100L);
      // Leaving the try block calls cf.close(); start timing just before that.
      t0 = Instant.now();
    }
    Instant t1 = Instant.now();
    long closeDurationMillis = Duration.between(t0, t1).toMillis();
    // The value is in milliseconds (the original message said "secs").
    System.out.println("Close Duration took " + closeDurationMillis + " millis");
    assertTrue(closeDurationMillis < 2_000L);
  }
}

  
  
