Hi, Puneet~

AFAIK, this is expected behavior: jobs running on a crashed TaskManager
are restarted. HA means there is no single point of failure, but a Flink
job still needs to go through failover to ensure state and data
consistency. You may refer to
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/state/task_failure_recovery/
for more details.
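
For illustration, here is a minimal flink-conf.yaml sketch of the settings
that govern this recovery. The values are assumptions chosen for the
example, not recommendations for your cluster:

```yaml
# How a failed job is restarted (illustrative values, tune for your workload).
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s

# Which tasks are restarted on a task failure. "region" restarts only the
# affected pipelined region; "full" restarts every task in the job.
jobmanager.execution.failover-strategy: region
```

Note that even with the region strategy, a streaming job whose tasks are all
connected (for example through a keyBy) usually forms a single region, so the
whole job restarts and its tasks are then rescheduled onto the free slots.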

On Fri, Mar 4, 2022 at 2:50 AM Puneet Duggal <puneetduggal1...@gmail.com>
wrote:

> Hi,
>
> Currently in production, I have an HA session-mode Flink cluster with 3 job
> managers and multiple task managers with more than enough free task slots.
> But I have seen multiple times that whenever a task manager goes down (e.g.
> due to a heartbeat issue), so do all the jobs running on it, even when
> there are standby task managers available with free slots to run them on.
> Has anyone faced this issue?
>
> Regards,
> Puneet



-- 
Best Regards,
Terry Wang
