Hi Gyula,

Why are you sure that the execution.shutdown-on-application-finish
configuration is what leads to this error? I noticed that the default value
of this option is already "true".

From my understanding, the completed checkpoint store should only clear its
persisted checkpoint information on shutdown when the job status is globally
terminated.
Did you check the ConfigMap that is used to store the completed checkpoint
store, and whether its content was empty right after you triggered the job
manager failure?

Best
Yun Tang

________________________________
From: Gyula Fóra <gyf...@apache.org>
Sent: Wednesday, May 11, 2022 3:41
To: dev <dev@flink.apache.org>
Subject: Flink job restarted from empty state when 
execution.shutdown-on-application-finish is enabled

Hi Devs!

I ran into a concerning situation and would like to hear your thoughts on
this.

I am running Flink 1.15 in Kubernetes native mode (using the operator, but
that is beside the point here) with Flink Kubernetes HA enabled.

We have enabled
*execution.shutdown-on-application-finish = true*

I noticed that after the job has failed/finished, if I kill the jobmanager
pod (triggering a jobmanager failover), the job is resubmitted from a
completely empty state (as if starting for the first time).

Has anyone encountered this issue? This makes using this config option
pretty risky.

Thank you!
Gyula
