[ https://issues.apache.org/jira/browse/FLINK-21942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Till Rohrmann updated FLINK-21942:
----------------------------------
    Affects Version/s: 1.12.2

> KubernetesLeaderRetrievalDriver not closed after terminated which lead to connection leak
> -----------------------------------------------------------------------------------------
>
>                 Key: FLINK-21942
>                 URL: https://issues.apache.org/jira/browse/FLINK-21942
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.12.2, 1.13.0
>            Reporter: Yi Tang
>            Priority: Major
>         Attachments: image-2021-03-24-18-08-30-196.png, image-2021-03-24-18-08-42-116.png, jstack.l
>
>
> It looks like the KubernetesLeaderRetrievalDriver is not closed even after the
> KubernetesLeaderElectionDriver is closed and the job reaches a globally terminated state.
> As a result, many ConfigMap watches stay active and keep their connections to K8s open.
> Once the number of connections exceeds the maximum number of concurrent requests, new
> ConfigMap watches cannot be started, and eventually all newly submitted jobs time out.
> [~fly_in_gis] [~trohrmann] This may be related to FLINK-20695. Could you confirm this issue?
> However, when many jobs are running in the same session cluster, the ConfigMap watches
> do need to stay active. Maybe we should merge all ConfigMap watches?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)