GitHub user ivoson opened a pull request: https://github.com/apache/spark/pull/20244
[SPARK-23053][CORE] taskBinarySerialization and task partitions calculate in DagScheduler.submitMissingTasks should keep the same RDD checkpoint status

…d is the same when calculate taskSerialization and task partitions

Change-Id: Ib9839ca552653343d264135c116742effa6feb60

## What changes were proposed in this pull request?

When concurrent jobs use the same RDD and that RDD is marked for checkpointing, a race can occur. One job finishes and starts RDD.doCheckpoint while another job is submitted, which calls submitStage and then submitMissingTasks.

[submitMissingTasks](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L961) serializes taskBinaryBytes and computes the task partitions, and both depend on the RDD's checkpoint status. If taskBinaryBytes is computed before doCheckpoint finishes and the partitions are computed after it finishes, the two values reflect different checkpoint states.

When the task runs, rdd.compute is called. RDDs that cast their partition to a specific type, such as [MapWithStateRDD](https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/rdd/MapWithStateRDD.scala), then throw a ClassCastException, because the partition passed in is actually a CheckpointRDDPartition.

## How was this patch tested?

The existing unit tests, plus a new test case in DAGSchedulerSuite that reproduces the exception.
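The race described under "What changes were proposed" can be illustrated with a small self-contained model. This is only a sketch: `ToyRdd`, `ToyTask`, `StatePartition`, `CheckpointPartition`, `TaskBinary`, and `snapshot` are hypothetical stand-ins and not Spark classes, and reading both values under one lock is shown as one possible way to keep them consistent, not necessarily what this patch does.

```scala
// Toy model only: none of these types are Spark's API.
sealed trait Partition { def index: Int }
// Stands in for an RDD-specific partition type (e.g. the one MapWithStateRDD casts to).
final case class StatePartition(index: Int) extends Partition
// Stands in for CheckpointRDDPartition.
final case class CheckpointPartition(index: Int) extends Partition

// Stands in for the serialized task binary: it records which compute path the RDD takes.
final case class TaskBinary(checkpointed: Boolean)

final class ToyRdd(numPartitions: Int) {
  private var checkpointed = false

  def doCheckpoint(): Unit = synchronized { checkpointed = true }

  // The partition type changes once the checkpoint has completed.
  def partitions: Seq[Partition] = synchronized {
    if (checkpointed) (0 until numPartitions).map(CheckpointPartition(_))
    else (0 until numPartitions).map(StatePartition(_))
  }

  def serialize(): TaskBinary = synchronized { TaskBinary(checkpointed) }

  // Read both values under a single lock so they observe the same checkpoint status.
  def snapshot(): (TaskBinary, Seq[Partition]) = synchronized {
    (serialize(), partitions)
  }
}

object ToyTask {
  // The non-checkpointed path casts the partition, as some RDDs do in compute.
  def compute(binary: TaskBinary, part: Partition): String =
    if (binary.checkpointed) s"checkpoint partition ${part.index}"
    else s"state partition ${part.asInstanceOf[StatePartition].index}"
}

object CheckpointRaceDemo {
  def main(args: Array[String]): Unit = {
    // Inconsistent read: the checkpoint completes between the two steps.
    val racy = new ToyRdd(2)
    val binary = racy.serialize()
    racy.doCheckpoint()
    val parts = racy.partitions
    val failed =
      try { ToyTask.compute(binary, parts.head); false }
      catch { case _: ClassCastException => true }
    println(s"inconsistent read threw ClassCastException: $failed")

    // Consistent read: both values are taken before the checkpoint completes.
    val safe = new ToyRdd(2)
    val (safeBinary, safeParts) = safe.snapshot()
    safe.doCheckpoint()
    println(ToyTask.compute(safeBinary, safeParts.head))
  }
}
```

In the first half the task binary still follows the pre-checkpoint compute path while the partitions already have the checkpoint type, so the cast fails. In the second half both values come from the same state, so the task runs regardless of when the checkpoint completes afterwards.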
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ivoson/spark branch-taskpart-mistype

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20244.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20244

----
commit 0dea573e9e724d591803b73f678e14f94e0af447
Author: huangtengfei <huangtengfei@...>
Date:   2018-01-12T02:53:29Z

    submitMissingTasks should make sure the checkpoint status of stage.rdd is the same when calculate taskSerialization and task partitions

    Change-Id: Ib9839ca552653343d264135c116742effa6feb60

----