Till Rohrmann created FLINK-10500:
-------------------------------------

             Summary: Let ExecutionGraphDriver react to fail signal
                 Key: FLINK-10500
                 URL: https://issues.apache.org/jira/browse/FLINK-10500
             Project: Flink
          Issue Type: Sub-task
          Components: Distributed Coordination
    Affects Versions: 1.7.0
            Reporter: Till Rohrmann
             Fix For: 1.7.0


In order to scale down when not enough resources are available or when 
TaskManagers (TMs) have died, the {{ExecutionGraphDriver}} needs to be notified 
of failures. Depending on the failure type and the set of available resources, 
it can then decide whether to scale the job down or simply restart it. Within 
the scope of this issue, the {{ExecutionGraphDriver}} should simply call into 
the {{RestartStrategy}}.
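
The behaviour in scope here (receive the fail signal, then delegate to the 
restart strategy) could look roughly like the sketch below. It is only an 
illustration under stated assumptions: the {{RestartStrategy}} interface shown 
is a simplified, hypothetical stand-in and not Flink's actual interface, and 
{{FixedAttemptsRestartStrategy}}, {{onFailure}} and the event log are names 
invented for this example. The scale-down decision is deliberately left out, 
since it is outside the scope of this issue.

```java
import java.util.ArrayList;
import java.util.List;

public class ExecutionGraphDriverSketch {

    /** Hypothetical, simplified stand-in for a restart strategy (not Flink's real interface). */
    interface RestartStrategy {
        boolean canRestart();

        void restart(Runnable restartAction);
    }

    /** Illustrative strategy that allows a bounded number of restarts. */
    static final class FixedAttemptsRestartStrategy implements RestartStrategy {
        private int remainingAttempts;

        FixedAttemptsRestartStrategy(int maxAttempts) {
            this.remainingAttempts = maxAttempts;
        }

        @Override
        public boolean canRestart() {
            return remainingAttempts > 0;
        }

        @Override
        public void restart(Runnable restartAction) {
            remainingAttempts--;
            restartAction.run();
        }
    }

    /** Sketch of a driver that reacts to a fail signal by calling into the restart strategy. */
    static final class ExecutionGraphDriver {
        private final RestartStrategy restartStrategy;
        // Records what happened so the behaviour can be inspected in this example.
        private final List<String> events = new ArrayList<>();

        ExecutionGraphDriver(RestartStrategy restartStrategy) {
            this.restartStrategy = restartStrategy;
        }

        /** Fail signal: the driver learns about the failure and asks the strategy what to do. */
        void onFailure(Throwable cause) {
            events.add("failed: " + cause.getMessage());
            if (restartStrategy.canRestart()) {
                restartStrategy.restart(() -> events.add("restarted"));
            } else {
                events.add("terminal failure");
            }
        }

        List<String> events() {
            return events;
        }
    }

    public static void main(String[] args) {
        ExecutionGraphDriver driver =
                new ExecutionGraphDriver(new FixedAttemptsRestartStrategy(1));
        driver.onFailure(new RuntimeException("TaskManager lost"));
        driver.onFailure(new RuntimeException("TaskManager lost again"));
        System.out.println(driver.events());
    }
}
```

A later change could extend {{onFailure}} to inspect the failure type and the 
available resources before deciding between a restart and a scale-down.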




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
