Zhu Zhu created FLINK-13055:
-------------------------------
Summary: Leverage JM side partition state to improve region
failover experience
Key: FLINK-13055
URL: https://issues.apache.org/jira/browse/FLINK-13055
Project: Flink
Issue Type: Sub-task
Components: Runtime / Coordination
Affects Versions: 1.8.1
Reporter: Zhu Zhu
Assignee: Zhu Zhu
In the current region failover process, the states of most input result
partitions are unknown. Even when the failure cause is a PartitionException,
only one unhealthy partition can be identified.
This may lead to multiple unsuccessful failovers before all the unhealthy but
needed partitions are identified and their producers are included in the
failover as well.
Using the partition states tracked on the JM side to identify unhealthy
(missing) partitions earlier during region failover can help with this case.
The basic idea is to build RestartPipelinedRegionStrategy with a
ResultPartitionAvailabilityChecker which can query the JM side tracked
partition states.
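
The idea above can be illustrated with a minimal, self-contained sketch. It
is not Flink's implementation: the Region class, the String partition ids, the
producerOf map, and the regionsToRestart method are hypothetical stand-ins,
and the checker interface here is a simplified assumption about what a
ResultPartitionAvailabilityChecker might look like. The sketch covers only the
upstream part of the decision: starting from the failed region, every consumed
partition that the checker reports as unavailable pulls its producer region
into the same failover, so all missing partitions are found in one pass.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class RegionFailoverSketch {

    /** Simplified, assumed shape of a checker backed by JM-side partition state. */
    interface ResultPartitionAvailabilityChecker {
        boolean isAvailable(String partitionId);
    }

    /** Hypothetical region model: a name plus the partitions it consumes. */
    static final class Region {
        final String name;
        final Set<String> consumed;

        Region(String name, String... consumed) {
            this.name = name;
            this.consumed = new LinkedHashSet<>(Arrays.asList(consumed));
        }
    }

    /**
     * Returns the names of the regions to restart: the failed region plus,
     * transitively, the producer regions of every unavailable input partition.
     */
    static Set<String> regionsToRestart(
            Region failed,
            Map<String, Region> producerOf,
            ResultPartitionAvailabilityChecker checker) {
        Set<String> toRestart = new LinkedHashSet<>();
        Deque<Region> queue = new ArrayDeque<>();
        toRestart.add(failed.name);
        queue.add(failed);
        while (!queue.isEmpty()) {
            Region region = queue.poll();
            for (String partition : region.consumed) {
                if (checker.isAvailable(partition)) {
                    continue; // healthy input, its producer can keep its result
                }
                Region producer = producerOf.get(partition);
                // Set.add returns false if the region was already scheduled.
                if (producer != null && toRestart.add(producer.name)) {
                    queue.add(producer);
                }
            }
        }
        return toRestart;
    }

    public static void main(String[] args) {
        // Pipeline of regions: A --p1--> B --p2--> C
        Region a = new Region("A");
        Region b = new Region("B", "p1");
        Region c = new Region("C", "p2");
        Map<String, Region> producerOf = new HashMap<>();
        producerOf.put("p1", a);
        producerOf.put("p2", b);

        // C fails and the JM knows that only p2 is missing.
        System.out.println(regionsToRestart(c, producerOf, p -> !p.equals("p2")));
    }
}
```

In this example, a failure of region C with only p2 missing restarts C and B
in a single failover, while A keeps its available result p1. Without the
checker, B would only be discovered after C had been restarted and failed
again on the missing partition.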