[ 
https://issues.apache.org/jira/browse/FLINK-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijiang Wang updated FLINK-5703:
---------------------------------
    Description: 
The ExecutionGraph structure would be recovered from TaskManager reports during 
reconciling period, and the necessary information includes:

    - Execution: ExecutionAttemptID, AttemptNumber, StartTimestamp, 
ExecutionState, SimpleSlot, PartialInputChannelDeploymentDescriptor
    - ExecutionVertex: Map<IntermediateResultPartitionID, 
IntermediateResultPartition>
    - ExecutionGraph: ConcurrentHashMap<ExecutionAttemptID, Execution>

For {{RECONCILING}} ExecutionState, it should be transition into any existing 
task states ({{RUNNING}},{{CANCELED}},{{FAILED}},{{FINISHED}}). To do so, the 
TaskManger should maintain the terminal task state 
({{CANCELED}},{{FAILED}},{{FINISHED}}) for a while and we try to realize this 
mechanism in another jira. In addition, the state transition would trigger 
different actions, and some actions rely on above necessary information. 
Considering this limit, the recovery process will be divided into two steps:

    - First, recovery all other necessary information except ExecutionState.
    - Second, transition ExecutionState into real task state and trigger 
actions. The behavior is the same with {{UpdateTaskExecutorState}}.

To make logic easy and consistency, during recovery period, all the other RPC 
messages ({{UpdateTaskExecutionState}}, {{ScheduleOrUpdateConsumers}},etc) from 
TaskManager should be refused temporarily and responded with a special message 
by JobMaster. Then the TaskManager should retry to send these messages later 
until JobManager ends recovery and acknowledgement.

The {{RECONCILING}} JobStatus, it would be transition into one of the states 
({{RUNNING}},{{FAILING}},{{FINISHED}}) after recovery.

    - {{RECONCILING}} to {{RUNNING}}: All the TaskManager report within 
duration time and all the tasks are in {{RUNNING}} states.
    - {{RECONCILING}} to {{FAILING}}: One of the TaskManager does not report in 
time, or one of the tasks state is in {{FAILED}} or {{CANCELED}}
    - {{RECONCILING}} to {{FINISHED}}: All the TaskManger report within 
duration time and all the tasks are in {{FINISHED}} states.

> ExecutionGraph recovery based on reconciliation with TaskManager reports
> ------------------------------------------------------------------------
>
>                 Key: FLINK-5703
>                 URL: https://issues.apache.org/jira/browse/FLINK-5703
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Distributed Coordination, JobManager
>            Reporter: Zhijiang Wang
>            Assignee: Zhijiang Wang
>
> The ExecutionGraph structure would be recovered from TaskManager reports 
> during reconciling period, and the necessary information includes:
>     - Execution: ExecutionAttemptID, AttemptNumber, StartTimestamp, 
> ExecutionState, SimpleSlot, PartialInputChannelDeploymentDescriptor
>     - ExecutionVertex: Map<IntermediateResultPartitionID, 
> IntermediateResultPartition>
>     - ExecutionGraph: ConcurrentHashMap<ExecutionAttemptID, Execution>
> For {{RECONCILING}} ExecutionState, it should be transition into any existing 
> task states ({{RUNNING}},{{CANCELED}},{{FAILED}},{{FINISHED}}). To do so, the 
> TaskManger should maintain the terminal task state 
> ({{CANCELED}},{{FAILED}},{{FINISHED}}) for a while and we try to realize this 
> mechanism in another jira. In addition, the state transition would trigger 
> different actions, and some actions rely on above necessary information. 
> Considering this limit, the recovery process will be divided into two steps:
>     - First, recovery all other necessary information except ExecutionState.
>     - Second, transition ExecutionState into real task state and trigger 
> actions. The behavior is the same with {{UpdateTaskExecutorState}}.
> To make logic easy and consistency, during recovery period, all the other RPC 
> messages ({{UpdateTaskExecutionState}}, {{ScheduleOrUpdateConsumers}},etc) 
> from TaskManager should be refused temporarily and responded with a special 
> message by JobMaster. Then the TaskManager should retry to send these 
> messages later until JobManager ends recovery and acknowledgement.
> The {{RECONCILING}} JobStatus, it would be transition into one of the states 
> ({{RUNNING}},{{FAILING}},{{FINISHED}}) after recovery.
>     - {{RECONCILING}} to {{RUNNING}}: All the TaskManager report within 
> duration time and all the tasks are in {{RUNNING}} states.
>     - {{RECONCILING}} to {{FAILING}}: One of the TaskManager does not report 
> in time, or one of the tasks state is in {{FAILED}} or {{CANCELED}}
>     - {{RECONCILING}} to {{FINISHED}}: All the TaskManger report within 
> duration time and all the tasks are in {{FINISHED}} states.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to