GitHub user mxm opened a pull request:

    https://github.com/apache/flink/pull/640

     [scheduling] implement backtracking of intermediate results

    For batch programs, we currently schedule all tasks which are sources
    and let them kick off the execution of the connected tasks. This
    approach bears some problems when executing large dataflows with many
    branches. With backtracking, we traverse the execution graph
    output-centrically (from the sinks) in a depth-first manner. This
    enables us to use resources differently.
    
    In the course of backtracking, only tasks will be executed that are
    required to supply inputs to the current task. When a job is newly
    submitted, this means that the backtracking will reach the
    sources. When the job has been previously executed and intermediate
    results are available, old ResultPartitions to resume from can be
    requested while backtracking.
    
    Backtracking is disabled by default. It can be enabled by setting the
    ScheduleMode in JobGraph to BACKTRACKING.
    
    CHANGELOG
    - new scheduling mode: backtracking
    - backtracks from the sinks of an ExecutionGraph
    - checks the availability of IntermediatePartitionResults
    - marks ExecutionVertex to be scheduled
    - caches ResultPartitions and reloads them
    - resumes from intermediate results
    - test for general behavior of backtracking (BacktrackingTest)
    - test for resuming from an intermediate result (ResumeITCase)
    - test for releasing of cached ResultPartitions (ResultPartitionManagerTest)
    - allow multiple consumers per blocking intermediate result (batch)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mxm/flink backtracking-scheduling-dev

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/640.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #640
    
----
commit a580c8973ccb3579c79ebd0dc860c1f754eb87fd
Author: Maximilian Michels <m...@apache.org>
Date:   2015-04-29T10:34:31Z

    [FLINK-1843] remove SoftReferences on archived ExecutionGraphs
    
    The previously introduced SoftReferences to store archived
    ExecutionGraphs cleared old graphs in a non-transparent order.

commit b9194c01bdbce25af2bf91c617af2eab94f3353c
Author: Maximilian Michels <m...@apache.org>
Date:   2015-04-29T13:59:11Z

    [JobManager] default to single-user mode, save the last ExecutionGraph for 
resuming

commit 57c12c36b82e866ba0b05b7a82bf0c42a0e8b042
Author: Maximilian Michels <m...@apache.org>
Date:   2015-04-13T17:32:40Z

    [scheduling] implement backtracking of intermediate results
    
    For batch programs, we currently schedule all tasks which are sources
    and let them kick off the execution of the connected tasks. This
    approach bears some problems when executing large dataflows with many
    branches. With backtracking, we traverse the execution graph
    output-centrically (from the sinks) in a depth-first manner. This
    enables us to use resources differently.
    
    In the course of backtracking, only tasks will be executed that are
    required to supply inputs to the current task. When a job is newly
    submitted, this means that the backtracking will reach the
    sources. When the job has been previously executed and intermediate
    results are available, old ResultPartitions to resume from can be
    requested while backtracking.
    
    Backtracking is disabled by default. It can be enabled by setting the
    ScheduleMode in JobGraph to BACKTRACKING.
    
    CHANGELOG
    - new scheduling mode: backtracking
    - backtracks from the sinks of an ExecutionGraph
    - checks the availability of IntermediatePartitionResults
    - marks ExecutionVertex to be scheduled
    - caches ResultPartitions and reloads them
    - resumes from intermediate results
    - test for general behavior of backtracking (BacktrackingTest)
    - test for resuming from an intermediate result (ResumeITCase)
    - test for releasing of cached ResultPartitions (ResultPartitionManagerTest)
    - allow multiple consumers per blocking intermediate result (batch)

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

Reply via email to