[ https://issues.apache.org/jira/browse/FLINK-19206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17234322#comment-17234322 ]
Mike Kaplinskiy commented on FLINK-19206:
-----------------------------------------

Yep that’s exactly right. Setting an owner reference on the deployment would be a perfect solution, but as you said this probably only makes sense for application mode clusters.

> Add an ability to set ownerReference manually in Kubernetes
> -----------------------------------------------------------
>
> Key: FLINK-19206
> URL: https://issues.apache.org/jira/browse/FLINK-19206
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / Kubernetes
> Reporter: Mike Kaplinskiy
> Priority: Minor
>
> The current Kubernetes deployment creates a service that is the
> ownerReference of all the sub-objects (the JM & TM deployments & the rest
> service). However, something presumably has to start the cluster in the first
> place. If you are using a job cluster, that can be something like a
> kubernetes Job, a CronJob or a tool like Airflow. Unfortunately any failures
> in the Flink job can cause retries from these higher-level primitives, which
> can yield a lot of "stale clusters" that aren't GCed.
>
> The proposal here is to add a configuration option to set the ownerReference
> of the Flink Service. This way the service (and by proxy, all the cluster
> components) get deleted when the "parent" decides - including if the parent
> is itself a Kubernetes pod. For reference, Spark does something similar via
> {{spark.kubernetes.driver.pod.name}} (documented at
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode-executor-pod-garbage-collection]).
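
For illustration only, here is a rough sketch of what the ownerReference from the issue description might look like once attached to the Flink Service manifest. It works on plain dictionaries rather than a Kubernetes client, and it is not how Flink implements or would implement this. The helper names (`build_owner_reference`, `attach_owner`), the object names, and the UID are all hypothetical. The field names (`apiVersion`, `kind`, `name`, `uid`, `controller`, `blockOwnerDeletion`) follow the Kubernetes ownerReference schema, and a namespaced owner has to live in the same namespace as its dependent.

```python
import copy
import json
from typing import Any, Dict


def build_owner_reference(parent: Dict[str, Any]) -> Dict[str, Any]:
    """Build an ownerReference entry that points at `parent` (hypothetical helper)."""
    meta = parent["metadata"]
    return {
        "apiVersion": parent["apiVersion"],
        "kind": parent["kind"],
        "name": meta["name"],
        # The UID must be that of the live parent object, as assigned by the API server.
        "uid": meta["uid"],
        # The parent only owns the object for garbage collection; it does not manage it.
        "controller": False,
        "blockOwnerDeletion": True,
    }


def attach_owner(child: Dict[str, Any], parent: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `child` with `parent` added to metadata.ownerReferences."""
    owned = copy.deepcopy(child)
    refs = owned.setdefault("metadata", {}).setdefault("ownerReferences", [])
    refs.append(build_owner_reference(parent))
    return owned


if __name__ == "__main__":
    # Hypothetical parent: the pod that launched the Flink cluster.
    launcher_pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": "flink-launcher",
            "namespace": "default",
            "uid": "00000000-0000-0000-0000-000000000000",
        },
    }
    # Hypothetical Flink Service manifest, before any owner is set.
    flink_service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "my-flink-cluster", "namespace": "default"},
    }
    print(json.dumps(attach_owner(flink_service, launcher_pod), indent=2))
```

With a reference like this in place, deleting the launcher pod would let the Kubernetes garbage collector remove the Service, and with it the JM and TM deployments that the Service already owns.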