[
https://issues.apache.org/jira/browse/FLINK-37567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939250#comment-17939250
]
Gyula Fora commented on FLINK-37567:
------------------------------------
You are right regarding the parallelism / slots, but this doesn't always work,
for example if parallelism is overridden programmatically.
The intention behind keeping the JM around is to allow users to observe the job,
checkpoint state, etc. if they want. There is a timeout after which it is
terminated. You can set this timeout to 0 to not keep it around. The JM
generally doesn't consume many resources, so it is usually fine.
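As a rough sketch of the timeout setting mentioned above: this assumes the
option in question is kubernetes.operator.jm-deployment.shutdown-ttl and that
it can be set per resource under spec.flinkConfiguration. Neither is stated in
this thread, so please check the operator configuration reference for your
version. The resource name is hypothetical.

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example-deployment   # hypothetical name, for illustration only
spec:
  flinkConfiguration:
    # Assumed option name: do not keep the JM around once the job is terminal.
    kubernetes.operator.jm-deployment.shutdown-ttl: "0 s"
```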
I can see the value of a standalone model, but a huge downside is that it cannot
easily integrate with active resource management for rescaling and adaptive
job / TM scheduling. For most users there is also no need for such advanced
customization. Even in custom Kubernetes environments, based on what I have
seen in the last 3 years, almost everyone still uses the native integration.
> Flink clusters not be clean up when using job cancellation as suspend
> mechanism
> -------------------------------------------------------------------------------
>
> Key: FLINK-37567
> URL: https://issues.apache.org/jira/browse/FLINK-37567
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.10.0, kubernetes-operator-1.11.0
> Reporter: Alan Zhang
> Priority: Major
>
> In general, for application mode, the Flink cluster lifecycle should be tied
> to the Flink job lifecycle, which means we should delete the Flink cluster
> if the job has stopped.
> However, I noticed that Flink clusters are not deleted when I tried to
> suspend a FlinkDeployment with "job-cancel" enabled. The CR shows the job in
> the "CANCELED" state, but the underlying Flink cluster is still running.