[ 
https://issues.apache.org/jira/browse/FLINK-28078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556863#comment-17556863
 ] 

Matthias Pohl commented on FLINK-28078:
---------------------------------------

{code}
16:17:07,802 [ForkJoinPool-45-worker-25] INFO  org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.CuratorFrameworkImpl [] - Starting
16:17:07,804 [ForkJoinPool-45-worker-25] INFO  org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.CuratorFrameworkImpl [] - Default schema
16:17:07,814 [ForkJoinPool-45-worker-25-EventThread] INFO  org.apache.flink.shaded.curator5.org.apache.curator.framework.state.ConnectionStateManager [] - State change: CONNECTED
16:17:07,817 [ForkJoinPool-45-worker-25-EventThread] INFO  org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.EnsembleTracker [] - New config event received: {}
16:17:07,824 [Curator-ConnectionStateManager-0] DEBUG org.apache.flink.runtime.leaderelection.ZooKeeperMultipleComponentLeaderElectionDriver [] - Connected to ZooKeeper quorum. Leader election can start.
16:17:07,824 [Curator-ConnectionStateManager-0] DEBUG org.apache.flink.runtime.leaderelection.ZooKeeperMultipleComponentLeaderElectionDriver [] - Connected to ZooKeeper quorum. Leader election can start.
16:17:07,826 [ForkJoinPool-45-worker-25-EventThread] INFO  org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.EnsembleTracker [] - New config event received: {}
16:17:07,848 [ForkJoinPool-45-worker-25-EventThread] DEBUG org.apache.flink.runtime.leaderelection.ZooKeeperMultipleComponentLeaderElectionDriver [] - ZooKeeperMultipleComponentLeaderElectionDriver obtained the leadership.
16:17:07,860 [ForkJoinPool-45-worker-25] INFO  org.apache.flink.runtime.leaderelection.ZooKeeperMultipleComponentLeaderElectionDriver [] - Closing ZooKeeperMultipleComponentLeaderElectionDriver.
{code}

The test creates three {{ElectionDriver}} instances and closes them one by one 
in a for loop. The logs of the failed run show that only two of the three 
established the quorum connection (i.e. the log message 
{{Connected to ZooKeeper quorum. Leader election can start.}} is printed only 
twice). The first iteration picks the first instance, checks its leadership 
and closes it. It looks like the second iteration then picks the instance 
whose quorum connection is still not established. Its leadership future can 
therefore never complete, which leaves the test stuck in the {{join}} call.
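
To illustrate the failure mode (not the actual test code or a proposed fix), 
here is a minimal sketch in plain Java. {{FakeDriver}}, 
{{grantLeadershipIfConnected}} and {{awaitLeadership}} are hypothetical names 
made up for this example and do not exist in Flink. The sketch assumes that a 
driver without a quorum connection never completes its leadership future, and 
it shows how a bounded {{get}} would surface that as a timeout, whereas an 
unbounded {{join}} would block indefinitely.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class LeadershipWaitSketch {

    /** Hypothetical stand-in for an election driver; this is NOT Flink's API. */
    static final class FakeDriver {
        final String name;
        final boolean connected;
        final CompletableFuture<Void> leadership = new CompletableFuture<>();

        FakeDriver(String name, boolean connected) {
            this.name = name;
            this.connected = connected;
        }

        /** Assumption: only a driver with a quorum connection can become leader. */
        void grantLeadershipIfConnected() {
            if (connected) {
                leadership.complete(null);
            }
        }
    }

    /** Bounded wait: true if leadership was obtained in time, false on timeout. */
    static boolean awaitLeadership(FakeDriver driver, long timeoutMillis)
            throws InterruptedException {
        try {
            driver.leadership.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // An unbounded leadership.join() would block forever at this point.
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FakeDriver connected = new FakeDriver("driver-0", true);
        FakeDriver notConnected = new FakeDriver("driver-1", false);
        connected.grantLeadershipIfConnected();
        notConnected.grantLeadershipIfConnected();

        System.out.println(connected.name + " leader: " + awaitLeadership(connected, 100));
        System.out.println(notConnected.name + " leader: " + awaitLeadership(notConnected, 100));
    }
}
```

In this sketch the unconnected driver is reported as a timeout after 100 ms 
instead of hanging the calling thread. Whether a timeout or waiting for all 
connections before the loop is the better remedy for the real test is a 
separate question.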

> ZooKeeperMultipleComponentLeaderElectionDriverTest.testLeaderElectionWithMultipleDrivers
>  runs into timeout
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-28078
>                 URL: https://issues.apache.org/jira/browse/FLINK-28078
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.16.0
>            Reporter: Matthias Pohl
>            Assignee: Matthias Pohl
>            Priority: Major
>              Labels: test-stability
>
> [Build 
> #36189|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=36189&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=24c3384f-1bcb-57b3-224f-51bf973bbee8&l=10455]
>  got stuck in 
> {{ZooKeeperMultipleComponentLeaderElectionDriverTest.testLeaderElectionWithMultipleDrivers}}
> {code}
> "ForkJoinPool-45-worker-25" #525 daemon prio=5 os_prio=0 tid=0x00007fc74d9e3800 nid=0x62c8 waiting on condition [0x00007fc6ff2f2000]
> May 30 16:36:10    java.lang.Thread.State: WAITING (parking)
> May 30 16:36:10       at sun.misc.Unsafe.park(Native Method)
> May 30 16:36:10       - parking to wait for  <0x00000000c2571b80> (a java.util.concurrent.CompletableFuture$Signaller)
> May 30 16:36:10       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> May 30 16:36:10       at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> May 30 16:36:10       at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3313)
> May 30 16:36:10       at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> May 30 16:36:10       at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1947)
> May 30 16:36:10       at org.apache.flink.runtime.leaderelection.ZooKeeperMultipleComponentLeaderElectionDriverTest.testLeaderElectionWithMultipleDrivers(ZooKeeperMultipleComponentLeaderElectionDriverTest.java:256)
> May 30 16:36:10       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> May 30 16:36:10       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> May 30 16:36:10       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> May 30 16:36:10       at java.lang.reflect.Method.invoke(Method.java:498)
> [...]
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)