GitHub user sitalkedia opened a pull request: https://github.com/apache/spark/pull/17297
[SPARK-14649][CORE] DagScheduler should not run duplicate tasks on fetch failure

## What changes were proposed in this pull request?

When a fetch failure occurs, the DAGScheduler re-launches the previous stage (to re-generate the output that was missing), and then re-launches all tasks in the stage with the fetch failure that hadn't completed when the fetch failure occurred (the DAGScheduler re-launches all of the tasks whose output data is not available -- which is equivalent to the set of tasks that hadn't yet completed). This sometimes leads to wasteful duplicate runs of long-running tasks. To address the issue, the following changes have been made:

1. When a fetch failure happens, the TaskSetManager asks the DAGScheduler to abort all of the non-running tasks. The running tasks in the task set are not killed.
2. When a task is aborted, the DAGScheduler adds it back to the pending task list.
3. When the stage is resubmitted, the DAGScheduler resubmits only the tasks that are in the pending state.

## How was this patch tested?

Added new tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sitalkedia/spark avoid_duplicate_tasks_new

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17297.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #17297

----

commit e5429d309801bffb8ddc907fb4800efb6fb1a2fa
Author: Sital Kedia <ske...@fb.com>
Date: 2016-04-15T23:44:23Z

    [SPARK-14649][CORE] DagScheduler should not run duplicate tasks on fetch failure

----
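The resubmission change described in the PR body can be sketched with a small model. This is illustrative Scala only, assuming a toy `StageAttempt` class; the names and structure are not Spark's actual DAGScheduler/TaskSetManager API. It contrasts the old resubmit set (every task without available output, i.e. pending plus still-running) with the proposed one (pending only):

```scala
import scala.collection.mutable

// Hypothetical model of per-stage task bookkeeping (not Spark's real API).
class StageAttempt(numTasks: Int) {
  private val pending   = mutable.Set.empty[Int] ++ (0 until numTasks) // not yet launched
  private val running   = mutable.Set.empty[Int]                       // launched, still executing
  private val completed = mutable.Set.empty[Int]                       // output available

  def launch(t: Int): Unit  = { pending -= t; running += t }
  def succeed(t: Int): Unit = { running -= t; completed += t }

  // Old behavior: on resubmission, re-run every task whose output is
  // missing -- including tasks that are still running (duplicates).
  def oldResubmitSet: Set[Int] = (pending ++ running).toSet

  // Proposed behavior: the TaskSetManager aborts only non-running tasks
  // (they stay pending); running tasks keep going, so only pending
  // tasks are re-launched with the resubmitted stage.
  def newResubmitSet: Set[Int] = pending.toSet
}

// Example: 4 tasks; task 0 finished and task 1 is still running when
// the fetch failure occurs; tasks 2 and 3 were never launched.
object Demo extends App {
  val stage = new StageAttempt(4)
  stage.launch(0); stage.succeed(0)
  stage.launch(1)
  println(stage.oldResubmitSet.toList.sorted) // List(1, 2, 3): task 1 runs twice
  println(stage.newResubmitSet.toList.sorted) // List(2, 3): no duplicate of task 1
}
```

Under this model, a long-running task like task 1 is no longer re-launched while its first attempt is still executing, which is the waste the patch aims to remove.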