Re: Task Manager shutdown causing jobs to fail

2022-03-07 Thread Zhilong Hong
Hi Puneet, as Terry says, if your job failed unexpectedly, check the restart-strategy setting in your flink-conf.yaml. If the restart strategy is disabled or set to none, the job transitions to failed as soon as it encounters a failover. The job would also fail itself
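
The check described above can be sketched in a few lines of Python. This is only an illustration, not part of Flink: the helper names parse_flat_conf and restarts_disabled are hypothetical, the parser assumes the flat "key: value" layout of flink-conf.yaml, and the set of values treated as "no restart" (none, off, disable) is an assumption based on the Flink configuration docs and should be verified against the version in use.

```python
# Assumption: values that Flink treats as "no restart strategy".
# Verify against the configuration docs for your Flink version.
NO_RESTART_VALUES = {"none", "off", "disable"}


def parse_flat_conf(text):
    """Parse flat `key: value` lines, skipping blank lines and # comments.

    Hypothetical helper: it only handles the simple flat layout and
    would mangle values that themselves contain a '#'.
    """
    conf = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


def restarts_disabled(conf):
    """Return True only if restart-strategy is explicitly set to a no-restart value."""
    value = conf.get("restart-strategy")
    return value is not None and value.lower() in NO_RESTART_VALUES


if __name__ == "__main__":
    sample = "restart-strategy: none\nrestart-strategy.fixed-delay.attempts: 3\n"
    print(restarts_disabled(parse_flat_conf(sample)))
```

Note that the sketch reports False when the key is absent; what Flink does in that case depends on other settings such as whether checkpointing is enabled, so an unset key still deserves a look.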

Re: Task Manager shutdown causing jobs to fail

2022-03-07 Thread Puneet Duggal
Hi Terry Wang, To add to the context above: whenever a task manager goes down, the jobs go into a failed state and do not restart, even though there are enough free slots available on other task managers for them to restart on. Regards, Puneet > On 04-Mar-2022, at 4:54 PM, Terry Wang

Re: Task Manager shutdown causing jobs to fail

2022-03-04 Thread Terry Wang
Hi Puneet, AFAIK, it is expected behavior that jobs on a crashed TaskManager restart. HA means there is no single point of failure, but a Flink job still needs to go through failover to ensure state and data consistency. You may refer

Task Manager shutdown causing jobs to fail

2022-03-03 Thread Puneet Duggal
Hi, I currently run an HA session-mode Flink cluster in production, with 3 job managers and multiple task managers that have more than enough free task slots. However, I have seen multiple times that whenever a task manager goes down (e.g. due to a heartbeat issue), so do all the jobs running on it, even