GitHub user sitalkedia opened a pull request:

    https://github.com/apache/spark/pull/17297

    [SPARK-14649][CORE] DagScheduler should not run duplicate tasks on fe…

    ## What changes were proposed in this pull request?
    
    When a fetch failure occurs, the DAGScheduler re-launches the previous 
stage (to regenerate the output that was missing), and then re-launches all 
tasks in the stage with the fetch failure that had not completed when the 
fetch failure occurred (the DAGScheduler re-launches all of the tasks whose 
output data is not available -- which is equivalent to the set of tasks that 
had not yet completed). This sometimes leads to wasteful duplicate task runs 
for jobs with long-running tasks.
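The current behavior described above can be modeled with a minimal sketch (plain Python, not Spark's actual classes or APIs): on a fetch failure, every partition whose output is not registered gets re-launched, including partitions whose first attempt is still running elsewhere, which is exactly where the duplicate work comes from.

```python
def partitions_to_resubmit(num_partitions, completed, running):
    """Partitions the scheduler would re-launch today: every partition that
    has not completed, regardless of whether an attempt is still running."""
    return [p for p in range(num_partitions) if p not in completed]

# Stage of 4 tasks: partitions 0 and 1 finished; 2 and 3 are still running
# (one of them long-running) when the fetch failure arrives.
resubmitted = partitions_to_resubmit(4, completed={0, 1}, running={2, 3})
print(resubmitted)  # partitions 2 and 3 are launched a second time,
                    # duplicating the attempts that are still in flight
```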
    
    To address the issue, the following changes have been made:
    
    1. When a fetch failure happens, the TaskSetManager asks the DAGScheduler 
to abort all of the non-running tasks. The tasks that are already running, 
however, are not killed.
    2. When a task is aborted, the DAGScheduler adds it to the pending task 
list.
    3. When the stage is resubmitted, the DAGScheduler resubmits only the 
tasks that are in the pending state.
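The three steps above can be sketched as follows (a hedged illustration in plain Python; the class and method names are made up for this sketch and are not Spark's): on a fetch failure, tasks that are not yet running are aborted into a pending set, running attempts are left alone, and a stage resubmission launches only the pending partitions.

```python
class StageState:
    """Toy bookkeeping for one stage's task set (illustrative only)."""

    def __init__(self, num_partitions):
        self.unlaunched = set(range(num_partitions))  # not yet scheduled
        self.running = set()                          # attempts in flight
        self.completed = set()
        self.pending = set()                          # aborted, awaiting resubmit

    def launch(self, p):
        self.unlaunched.discard(p)
        self.running.add(p)

    def finish(self, p):
        self.running.discard(p)
        self.completed.add(p)

    def on_fetch_failure(self):
        # Steps 1-2: abort the non-running tasks and move them to the
        # pending list; running attempts are deliberately not killed.
        self.pending |= self.unlaunched
        self.unlaunched.clear()

    def resubmit(self):
        # Step 3: only the pending partitions are re-launched.
        to_launch = sorted(self.pending)
        self.pending.clear()
        self.running |= set(to_launch)
        return to_launch

# Stage of 5 partitions: 0-2 launched, 0 finished, then a fetch failure.
s = StageState(5)
for p in (0, 1, 2):
    s.launch(p)
s.finish(0)
s.on_fetch_failure()
relaunched = s.resubmit()
print(relaunched)  # only the aborted partitions come back; the in-flight
                   # attempts for 1 and 2 are never duplicated
```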
    
    
    ## How was this patch tested?
    
    Added new tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sitalkedia/spark avoid_duplicate_tasks_new

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17297.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17297
    
----
commit e5429d309801bffb8ddc907fb4800efb6fb1a2fa
Author: Sital Kedia <ske...@fb.com>
Date:   2016-04-15T23:44:23Z

    [SPARK-14649][CORE] DagScheduler should not run duplicate tasks on fetch 
failure

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
