GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21758
[SPARK-24795][CORE] Implement barrier execution mode

## What changes were proposed in this pull request?

Propose new APIs and modify job/task scheduling to support barrier execution mode, which requires all tasks in the same barrier stage to start at the same time and retries all tasks if any task fails in the middle. Barrier execution mode is useful for some ML/DL workloads.

The proposed API changes include:

- `RDDBarrier`, which marks an RDD as barrier (Spark must launch all the tasks together for the current stage).
- `BarrierTaskContext`, which supports global sync of all tasks in a barrier stage and provides extra `BarrierTaskInfo`s.

In DAGScheduler, we retry all tasks of a barrier stage if any task fails in the middle. This is achieved by unregistering the map outputs for a shuffleId (for a ShuffleMapStage) or clearing the finished partitions in an active job (for a ResultStage).

## How was this patch tested?

- Add `RDDBarrierSuite` to ensure we convert RDDs correctly.
- Add new test cases in `DAGSchedulerSuite` to ensure we do task scheduling correctly.
- Add new test cases in `SparkContextSuite` to ensure the barrier execution mode actually works (under both local mode and local cluster mode).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark barrier-execution-mode

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21758.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21758

----
commit c25ec473ff078c071aec513953f56c64e6a228a4
Author: Xingbo Jiang <xingbo.jiang@...>
Date:   2018-07-12T17:38:58Z

    implement barrier execution mode.

----
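
For readers unfamiliar with the semantics described in this pull request, the following is a minimal toy model of a barrier stage, written with Python's standard `threading` module. It is not Spark code and does not use the APIs proposed here: `ToyBarrierContext`, `run_barrier_stage`, and `flaky_task` are hypothetical names introduced only for illustration. The sketch assumes threads stand in for tasks, and it shows the two properties the description names: all tasks are launched together and can synchronize globally, and a failure in any one task causes every task in the stage to be retried.

```python
import threading


class ToyBarrierContext:
    """Hypothetical stand-in for a per-task barrier context (not Spark's class)."""

    def __init__(self, partition_id, attempt, barrier):
        self.partition_id = partition_id
        self.attempt = attempt
        self._barrier = barrier

    def barrier(self):
        # Block until every task in the stage has reached this call.
        self._barrier.wait()


def run_barrier_stage(num_tasks, task_fn, max_attempts=3):
    """Launch all tasks together; if any task fails, rerun the whole stage."""
    for attempt in range(max_attempts):
        barrier = threading.Barrier(num_tasks)
        results = [None] * num_tasks
        errors = []

        def run(pid):
            ctx = ToyBarrierContext(pid, attempt, barrier)
            try:
                results[pid] = task_fn(ctx)
            except threading.BrokenBarrierError:
                # A peer failed and aborted the barrier; this task's work
                # is discarded along with the rest of the attempt.
                pass
            except Exception as exc:
                errors.append(exc)
                # Release any peers blocked at the barrier so they do not hang.
                barrier.abort()

        threads = [threading.Thread(target=run, args=(pid,))
                   for pid in range(num_tasks)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        if not errors:
            return results
        # At least one task failed: drop all partial results and retry
        # every task, including those that finished successfully.

    raise RuntimeError(f"barrier stage failed after {max_attempts} attempts")


def flaky_task(ctx):
    # Simulate a failure in partition 1 on the first attempt only.
    if ctx.attempt == 0 and ctx.partition_id == 1:
        raise ValueError("simulated task failure")
    ctx.barrier()  # global sync across all tasks in the stage
    return (ctx.partition_id, ctx.attempt)


print(run_barrier_stage(3, flaky_task))
```

In this model, the single failure in partition 1 on attempt 0 forces partitions 0 and 2 to rerun as well, so every returned tuple carries attempt number 1. Discarding the whole `results` list loosely mirrors the description's approach of clearing finished partitions rather than resubmitting only the failed task.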