[ https://issues.apache.org/jira/browse/FLINK-20219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-20219:
-----------------------------------
    Labels: pull-request-available  (was: )

> Rethink the HA related ZNodes/ConfigMap clean up for session cluster
> --------------------------------------------------------------------
>
>                 Key: FLINK-20219
>                 URL: https://issues.apache.org/jira/browse/FLINK-20219
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes, Deployment / Scripts, Runtime / Coordination
>    Affects Versions: 1.12.0
>            Reporter: Yang Wang
>            Assignee: Yang Wang
>            Priority: Major
>              Labels: pull-request-available
>
> While testing the Kubernetes HA service, I realized that the ConfigMap clean up for session clusters (both standalone and native) is not straightforward.
> * For the native K8s session, we suggest that users stop it via {{echo 'stop' | ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=<ClusterID> -Dexecution.attached=true}}. Currently, this has the same effect as {{kubectl delete deploy <ClusterID>}}. Neither cleans up the leader ConfigMaps (e.g. ResourceManager, Dispatcher, RestServer, JobManager). Even if no jobs are running before the stop, some ConfigMaps are still retained. So when and how should the retained ConfigMaps be cleaned up? Should the user do it manually, or should we provide some utility in the Flink client?
> * For the standalone session, I think it is reasonable for users to clean up the HA ConfigMaps manually.
>
> The following command can be used for the manual clean up:
> {{kubectl delete cm --selector='app=<ClusterID>,configmap-type=high-availability'}}
>
> Note: this is not a problem for a Flink application cluster, since the clean up can be done automatically once all running jobs in the application have reached a terminal state (e.g. FAILED, CANCELED, FINISHED) and the Flink cluster is then destroyed.
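
The issue above raises the option of providing a clean-up utility in the Flink client. As an illustration only, here is a minimal sketch of what such a utility could look like using the fabric8 {{KubernetesClient}} (which Flink's Kubernetes integration depends on). The class name, method names, namespace, and cluster id below are hypothetical; only the label keys ({{app=<ClusterID>}}, {{configmap-type=high-availability}}) come from the issue itself, and the code is simply the programmatic equivalent of the {{kubectl delete cm --selector=...}} command quoted above.

{code:java}
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

/**
 * Hypothetical helper that removes the retained HA ConfigMaps of a session cluster.
 * Equivalent to:
 *   kubectl delete cm --selector='app=<ClusterID>,configmap-type=high-availability'
 */
public final class HaConfigMapCleanup {

    public static void cleanup(String namespace, String clusterId) {
        // Auto-configures from ~/.kube/config or the in-cluster service account.
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            client.configMaps()
                    .inNamespace(namespace)
                    .withLabel("app", clusterId)
                    .withLabel("configmap-type", "high-availability")
                    .delete();
        }
    }

    public static void main(String[] args) {
        // Example: clean up the retained HA ConfigMaps of a session cluster named "my-session".
        cleanup("default", "my-session");
    }
}
{code}

Whether such a clean up should be triggered automatically on {{echo 'stop' | ./bin/kubernetes-session.sh ...}}, exposed as a separate client command, or left as a documented manual {{kubectl}} step is exactly the open question of this issue.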