Should the entire cluster be restarted if a single Task Manager crashes?

Kumar Bolar, Harshith Fri, 18 Jan 2019 01:53:17 -0800

Hi all,

We're running a standalone Flink cluster with 2 Job Managers and 3 Task
Managers. Whenever a TM crashes, we simply restart that particular TM and
continue processing.


But the comments on this Stack Overflow question
<https://stackoverflow.com/questions/54149134/what-happen-to-state-in-flink-task-manager-when-crash>
make it look like we need to restart all 5 nodes that form the cluster to
recover from the failure of a single TM. Am I reading this right? What would
be the consequences if we restart just the crashed TM and leave the healthy
ones running as is?

Thanks,
Harshith
