[ https://issues.apache.org/jira/browse/FLINK-21902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305898#comment-17305898 ]
Yang Wang commented on FLINK-21902:
-----------------------------------

Thanks for creating this ticket. This is a valid concern, and we already have the same analysis in FLINK-21685.

I think we could have a dedicated thread pool for the {{Fabric8FlinkKubeClient}}. And a default value of 2 is enough.

> A deadlock while using K8s HA service
> -------------------------------------
>
>                 Key: FLINK-21902
>                 URL: https://issues.apache.org/jira/browse/FLINK-21902
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Yi Tang
>            Priority: Major
>
> The `KubernetesStateHandleStore` uses the same thread pool executor as the
> Dispatcher to run `checkAndUpdateConfigMap`, which can lead to a deadlock.
> Example:
> {code:java}
> private CompletableFuture<Void> removeJob(JobID jobId, CleanupJobState cleanupJobState) {
>     final DispatcherJob job = checkNotNull(runningJobs.remove(jobId));
>     final CompletableFuture<Void> jobTerminationFuture = job.closeAsync();
>     return jobTerminationFuture.thenRunAsync(
>             () -> cleanUpJobData(jobId, cleanupJobState.cleanupHAData), ioExecutor);
> }
> {code}
> will eventually call
> {code:java}
> public CompletableFuture<Boolean> checkAndUpdateConfigMap(
>         String configMapName,
>         Function<KubernetesConfigMap, Optional<KubernetesConfigMap>> function) {
>     ...
>     CompletableFuture.supplyAsync(..., kubeClientExecutorService)
>     ...
> }
> {code}
> Here the ioExecutor and the kubeClientExecutorService are the same executor.
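
For illustration only, here is a minimal, self-contained sketch of the pattern described above. It is not Flink code: the class name, the method completesInTime, the pool sizes, and the timeout are all hypothetical, and it assumes the outer task blocks on the result of the inner task. A pool of size 1 is used to make the starvation deterministic; with a larger shared pool the same thing can happen once every thread is occupied by a blocking outer task.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SharedExecutorDeadlockSketch {

    /**
     * Runs an outer task on ioExecutor that submits an inner task to
     * kubeClientExecutor and blocks until the inner task is done.
     * Returns true if the outer task finishes within the timeout,
     * false if it is still stuck when the timeout expires.
     */
    public static boolean completesInTime(
            ExecutorService ioExecutor,
            ExecutorService kubeClientExecutor,
            long timeoutMillis) throws Exception {
        CompletableFuture<Boolean> outer = CompletableFuture.supplyAsync(
                () -> {
                    try {
                        // Blocking wait on work that needs a free thread in kubeClientExecutor.
                        return CompletableFuture
                                .supplyAsync(() -> true, kubeClientExecutor)
                                .get();
                    } catch (Exception e) {
                        throw new IllegalStateException(e);
                    }
                },
                ioExecutor);
        try {
            return outer.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Same executor for both roles: the only thread waits for a task
        // that can never be scheduled.
        ExecutorService shared = Executors.newFixedThreadPool(1);
        System.out.println("shared executor completes: "
                + completesInTime(shared, shared, 1000));
        shared.shutdownNow(); // interrupts the stuck worker

        // Dedicated pool for the inner work (size 2, as suggested in the comment).
        ExecutorService io = Executors.newFixedThreadPool(1);
        ExecutorService kube = Executors.newFixedThreadPool(2);
        System.out.println("dedicated executor completes: "
                + completesInTime(io, kube, 1000));
        io.shutdownNow();
        kube.shutdownNow();
    }
}
```

With the shared single-thread pool the outer task times out, while with a separate pool for the inner work it completes, which is the motivation for giving the Kubernetes client its own thread pool.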