[ https://issues.apache.org/jira/browse/FLINK-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405484#comment-15405484 ]
Zhijiang Wang commented on FLINK-4256:
--------------------------------------

As a further improvement, if task c fails, downstream tasks such as d and e should not be restarted. We have already done some work related to this.

I am also interested in caching intermediate results, since it can solve the problem of restarting the upstream tasks of a failed task. Is that the PIPELINED_PERSISTENT result partition type? I look forward to further plans for it.

> Fine-grained recovery
> ---------------------
>
>                 Key: FLINK-4256
>                 URL: https://issues.apache.org/jira/browse/FLINK-4256
>             Project: Flink
>          Issue Type: Improvement
>          Components: JobManager
>    Affects Versions: 1.1.0
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 1.2.0
>
>
> When a task fails during execution, Flink currently resets the entire
> execution graph and triggers complete re-execution from the last completed
> checkpoint. This is more expensive than just re-executing the failed tasks.
> In many cases, more fine-grained recovery is possible.
> The full description and design is in the corresponding FLIP.
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)