Weijie Guo created FLINK-29769:
----------------------------------
Summary: Further limit the explosion range of failover in hybrid
shuffle mode
Key: FLINK-29769
URL: https://issues.apache.org/jira/browse/FLINK-29769
Project: Flink
Issue Type: Sub-task
Components: Runtime / Coordination
Affects Versions: 1.17.0
Reporter: Weijie Guo
Under the current failover strategy, if a region changes to the failed state,
all its downstream regions must be restarted. For ALL_EDGES_BLOCKING type jobs,
this incurs no additional overhead, since they are scheduled stage by stage.
In hybrid shuffle mode, however, upstream and downstream tasks can run at the
same time. If an upstream task fails, we hope that it will not affect the
downstream regions that do not consume it.
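
To make the difference concrete, here is a minimal sketch in plain Java. It does
not use Flink's scheduler classes: the class FailoverRangeSketch, the method
regionsToRestart, the region names A to D, and the "affects" predicate are all
hypothetical and exist only for illustration. The first call restarts every
downstream region, as the current strategy does. The second call assumes we can
tell that region C does not consume the failed region A, and skips it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

public class FailoverRangeSketch {

    /**
     * Breadth-first walk from the failed region over its consumers.
     * An edge (upstream, downstream) is followed only if affects.test(...)
     * returns true. Hypothetical helper, not part of Flink.
     */
    static Set<String> regionsToRestart(
            String failedRegion,
            Map<String, List<String>> consumers,
            BiPredicate<String, String> affects) {
        Set<String> toRestart = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        toRestart.add(failedRegion);
        queue.add(failedRegion);
        while (!queue.isEmpty()) {
            String region = queue.poll();
            for (String consumer : consumers.getOrDefault(region, List.of())) {
                // add() returns false if the region was already scheduled for restart
                if (affects.test(region, consumer) && toRestart.add(consumer)) {
                    queue.add(consumer);
                }
            }
        }
        return toRestart;
    }

    public static void main(String[] args) {
        // Assumed topology: A -> B, A -> C, B -> D
        Map<String, List<String>> consumers =
                Map.of("A", List.of("B", "C"), "B", List.of("D"));

        // Current behavior: every downstream region is restarted.
        System.out.println(regionsToRestart("A", consumers, (up, down) -> true));

        // Desired behavior under the assumption that C does not consume A.
        System.out.println(regionsToRestart(
                "A", consumers, (up, down) -> !(up.equals("A") && down.equals("C"))));
    }
}
```

With this assumed topology the first call yields A, B, C and D, while the second
yields only A, B and D. How the scheduler would actually decide whether a
downstream region consumes the failed one is the open part of this ticket and
is not addressed by the sketch.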