Hi,

We are using Flink 1.14.4 with the Flink Kubernetes Operator. Sometimes, when we update a job, it fails on startup, and Flink then removes all HA metadata and shuts down the JobManager.
2022-09-14 14:54:44,534 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Restoring job 00000000000000000000000000000000 from Checkpoint 30829 @ 1663167158684 for 00000000000000000000000000000000 located at s3p://flink-checkpoints/k8s-checkpoint-job-name/00000000000000000000000000000000/chk-30829.
2022-09-14 14:54:44,638 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 00000000000000000000000000000000 reached terminal state FAILED.
org.apache.flink.runtime.client.JobInitializationException: Could not start the JobMaster.
Caused by: java.util.concurrent.CompletionException: java.lang.IllegalStateException: There is no operator for the state 4e1d9dde287c33a35e7970cbe64a40fe
2022-09-14 14:54:44,930 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error occurred in the cluster entrypoint.
2022-09-14 14:54:45,020 INFO org.apache.flink.kubernetes.highavailability.KubernetesHaServices [] - Clean up the high availability data for job 00000000000000000000000000000000.
2022-09-14 14:54:45,020 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting KubernetesApplicationClusterEntrypoint down with application status UNKNOWN. Diagnostics Cluster entrypoint has been closed externally..
2022-09-14 14:54:45,026 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Shutting down rest endpoint.
2022-09-14 14:54:46,122 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Shutting down remote daemon.
2022-09-14 14:54:46,321 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remoting shut down.

Kubernetes then restarts the JobManager pod, and the new instance, finding no HA metadata, starts the job from an empty state.

Is there an option to prevent the JobManager from exiting after a job reaches the FAILED state?
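In case it helps with reproducing or monitoring this, the sequence in the log (a job reaches the terminal state FAILED and its HA data is cleaned up afterwards) can be spotted with a short script. This is only a rough sketch: the function name and the regular expressions are illustrative assumptions derived from the log lines in this message, and they are not part of Flink or the operator.

```python
import re

# Patterns taken from the two log messages quoted in this email.
# The names below are hypothetical helpers, not a Flink API.
FAILED = re.compile(r"Job (\w+) reached terminal state FAILED")
HA_CLEANUP = re.compile(r"Clean up the high availability data for job (\w+)")


def jobs_with_ha_cleanup_after_failure(lines):
    """Return job IDs whose HA data was cleaned up after the job reached FAILED."""
    failed = set()
    affected = []
    for line in lines:
        match = FAILED.search(line)
        if match:
            failed.add(match.group(1))
            continue
        match = HA_CLEANUP.search(line)
        if match and match.group(1) in failed and match.group(1) not in affected:
            affected.append(match.group(1))
    return affected
```

Feeding it the JobManager log line by line (for example from a saved log file) should return the job ID 00000000000000000000000000000000 for the excerpt in this message.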