Yi Tang created FLINK-21902:
-------------------------------

             Summary: A deadlock while using K8s HA service
                 Key: FLINK-21902
                 URL: https://issues.apache.org/jira/browse/FLINK-21902
             Project: Flink
          Issue Type: Bug
            Reporter: Yi Tang

The `KubernetesStateHandleStore` uses the same thread pool executor as the Dispatcher to run `checkAndUpdateConfigMap`, which will lead to a deadlock.

For example:
{code:java}
private CompletableFuture<Void> removeJob(JobID jobId, CleanupJobState cleanupJobState) {
    final DispatcherJob job = checkNotNull(runningJobs.remove(jobId));

    final CompletableFuture<Void> jobTerminationFuture = job.closeAsync();

    // The cleanup is scheduled on ioExecutor.
    return jobTerminationFuture.thenRunAsync(
            () -> cleanUpJobData(jobId, cleanupJobState.cleanupHAData), ioExecutor);
}
{code}
will eventually call
{code:java}
public CompletableFuture<Boolean> checkAndUpdateConfigMap(
        String configMapName,
        Function<KubernetesConfigMap, Optional<KubernetesConfigMap>> function) {
    ...
    CompletableFuture.supplyAsync(..., kubeClientExecutorService)
    ...
}
{code}
Here `ioExecutor` and `kubeClientExecutorService` are the same executor.
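
The pattern described above can be reproduced outside Flink with a minimal sketch. It assumes that the cleanup task blocks on the future returned by the ConfigMap update; the report does not spell this out. The class `SharedExecutorDeadlockSketch` and the methods `updateOn` and `cleanUp` are hypothetical stand-ins, not Flink code. A single-thread pool and a timeout are used only so that the starvation is deterministic and the example terminates instead of hanging.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SharedExecutorDeadlockSketch {

    /** Hypothetical stand-in for checkAndUpdateConfigMap: does its work on the given executor. */
    static CompletableFuture<Boolean> updateOn(ExecutorService kubeClientExecutor) {
        return CompletableFuture.supplyAsync(() -> true, kubeClientExecutor);
    }

    /**
     * Hypothetical stand-in for the cleanup step: runs on ioExecutor and blocks
     * until the update finishes. Returns false if the update did not complete
     * within the timeout (assumption: the real code would wait without a timeout).
     */
    static boolean cleanUp(ExecutorService ioExecutor, ExecutorService kubeClientExecutor)
            throws Exception {
        CompletableFuture<Boolean> cleanup = CompletableFuture.supplyAsync(() -> {
            try {
                // Blocks an ioExecutor thread while waiting for a kubeClientExecutor thread.
                return updateOn(kubeClientExecutor).get(500, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                return false; // the inner task never got a thread in time
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }, ioExecutor);
        return cleanup.get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService shared = Executors.newFixedThreadPool(1);
        ExecutorService separate = Executors.newFixedThreadPool(1);
        try {
            // Same executor for both roles: the only thread waits on work queued behind it.
            System.out.println("shared executor completed: " + cleanUp(shared, shared));
            // Distinct executors: the inner task runs on its own thread.
            System.out.println("separate executors completed: " + cleanUp(shared, separate));
        } finally {
            shared.shutdownNow();
            separate.shutdownNow();
        }
    }
}
```

With one executor in both roles the inner task stays queued behind the thread that is waiting for it, so the first call reports `false`; with a separate executor for the update the second call reports `true`. A larger pool only raises the number of concurrent cleanups needed to exhaust it.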