High Availability on Kubernetes

2021-10-25 Thread Deshpande, Omkar
Hello, we are running Flink on Kubernetes (standalone) in application cluster mode. The JobManager is deployed as a Deployment with only one instance/replica, so the leader election service is not required. We have also set Flink task execution retries to infinite. Do we st

Re: High Availability on Kubernetes

2021-10-25 Thread Xintong Song
Without HA, your job can restore from the latest successful checkpoint only if your JobManager process / pod has not failed. If the JobManager fails, the new JobManager brought up by Kubernetes will not be able to find the latest successful checkpoint without HA. The JobManager can fail due to not onl
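
For illustration, the reply's point (HA is what lets a replacement JobManager locate the latest checkpoint) could translate into a flink-conf.yaml fragment roughly like the one below. This is a sketch, not the configuration used in the thread: the cluster id and the storage path are hypothetical placeholders, and the exact keys should be checked against the documentation for the Flink version in use.

```yaml
# Kubernetes-based HA services: checkpoint pointers and job metadata are
# tracked via ConfigMaps, so a restarted JobManager pod can find them.
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory

# Durable storage for HA metadata (hypothetical bucket/path).
high-availability.storageDir: s3://my-bucket/flink/ha

# Must stay the same across JobManager restarts (hypothetical id).
kubernetes.cluster-id: my-application-cluster

# "Infinite" task retries, as described in the question, expressed as a
# fixed-delay restart strategy with the maximum attempt count.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 2147483647
restart-strategy.fixed-delay.delay: 10 s
```

With a setup along these lines, the restart strategy covers task failures while the JobManager stays alive, and the HA settings cover the case where the JobManager pod itself is replaced. The JobManager's service account would also need permission to create and update ConfigMaps.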