[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Ousterhout updated SPARK-14658:
-----------------------------------
    Fix Version/s: 2.2.0

> When an executor is lost, DAGScheduler may submit a stage twice even if the
> first running task set for that stage has not finished
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-14658
>                 URL: https://issues.apache.org/jira/browse/SPARK-14658
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 1.6.1, 2.0.0, 2.1.0, 2.2.0
>         Environment: Spark 1.6.1, hadoop-2.6.0-cdh5.4.2
>            Reporter: yixiaohua
>             Fix For: 2.2.0
>
>
> {code}
> 16/04/14 15:35:22 ERROR DAGSchedulerEventProcessLoop: DAGSchedulerEventProcessLoop failed; shutting down SparkContext
> java.lang.IllegalStateException: more than one active taskSet for stage 57: 57.2,57.1
>         at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:173)
>         at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1052)
>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
>         at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1214)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1637)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
>         at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {code}
> First Time:
> {code}
> 16/04/14 15:35:20 INFO DAGScheduler: Resubmitting ShuffleMapStage 57 (run at 
> AccessController.java:-2) because some of its tasks had failed: 5, 8, 9, 12, 
> 13, 16, 17, 18, 19, 23, 26, 27, 28, 29, 30, 31, 40, 42, 43, 48, 49, 50, 51, 
> 52, 53, 55, 56, 57, 59, 60, 61, 67, 70, 71, 84, 85, 86, 87, 98, 99, 100, 101, 
> 108, 109, 110, 111, 112, 113, 114, 115, 126, 127, 134, 136, 137, 146, 147, 
> 150, 151, 154, 155, 158, 159, 162, 163, 164, 165, 166, 167, 170, 171, 172, 
> 173, 174, 175, 176, 177, 178, 179, 180, 181, 188, 189, 190, 191, 198, 199, 
> 204, 206, 207, 208, 218, 219, 222, 223, 230, 231, 236, 238, 239
> 16/04/14 15:35:20 DEBUG DAGScheduler: submitStage(ShuffleMapStage 57)
> 16/04/14 15:35:20 DEBUG DAGScheduler: missing: List()
> 16/04/14 15:35:20 INFO DAGScheduler: Submitting ShuffleMapStage 57 
> (MapPartitionsRDD[7887] at run at AccessController.java:-2), which has no 
> missing parents
> 16/04/14 15:35:20 DEBUG DAGScheduler: submitMissingTasks(ShuffleMapStage 57)
> 16/04/14 15:35:20 INFO DAGScheduler: Submitting 100 missing tasks from 
> ShuffleMapStage 57 (MapPartitionsRDD[7887] at run at AccessController.java:-2)
> 16/04/14 15:35:20 DEBUG DAGScheduler: New pending partitions: Set(206, 177, 
> 127, 98, 48, 27, 23, 163, 238, 188, 159, 28, 109, 59, 9, 176, 126, 207, 174, 
> 43, 170, 208, 158, 108, 29, 8, 204, 154, 223, 173, 219, 190, 111, 61, 40, 
> 136, 115, 86, 57, 155, 55, 230, 222, 180, 172, 151, 101, 18, 166, 56, 137, 
> 87, 52, 171, 71, 42, 167, 198, 67, 17, 236, 165, 13, 5, 53, 178, 99, 70, 49, 
> 218, 147, 164, 114, 85, 60, 31, 179, 150, 19, 100, 50, 175, 146, 134, 113, 
> 84, 51, 30, 199, 26, 16, 191, 162, 112, 12, 239, 231, 189, 181, 110)
> {code}
> Second Time:
> {code}
> 16/04/14 15:35:22 INFO DAGScheduler: Resubmitting ShuffleMapStage 57 (run at 
> AccessController.java:-2) because some of its tasks had failed: 26
> 16/04/14 15:35:22 DEBUG DAGScheduler: submitStage(ShuffleMapStage 57)
> 16/04/14 15:35:22 DEBUG DAGScheduler: missing: List()
> 16/04/14 15:35:22 INFO DAGScheduler: Submitting ShuffleMapStage 57 
> (MapPartitionsRDD[7887] at run at AccessController.java:-2), which has no 
> missing parents
> 16/04/14 15:35:22 DEBUG DAGScheduler: submitMissingTasks(ShuffleMapStage 57)
> 16/04/14 15:35:22 INFO DAGScheduler: Submitting 1 missing tasks from 
> ShuffleMapStage 57 (MapPartitionsRDD[7887] at run at AccessController.java:-2)
> 16/04/14 15:35:22 DEBUG DAGScheduler: New pending partitions: Set(26)
> {code}
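
The exception in the first log block can be read as an invariant check: a new task set for a stage is rejected while an earlier attempt of the same stage is still active. The following is a toy Python model of that check, offered only as a sketch. It is not Spark's code; `TaskSet`, `ToyTaskScheduler`, the `zombie` flag as modelled here, and the small partition sets are all hypothetical names and values chosen for illustration.

```python
# Toy model of the invariant behind "more than one active taskSet for stage".
# Every name below is hypothetical; this is not Spark's implementation.
from dataclasses import dataclass


@dataclass
class TaskSet:
    stage_id: int
    attempt: int
    pending: set          # partitions this attempt still has to compute
    zombie: bool = False  # True once the attempt should launch no more tasks

    @property
    def id(self) -> str:
        return f"{self.stage_id}.{self.attempt}"


class ToyTaskScheduler:
    def __init__(self):
        # stage id -> {attempt number -> TaskSet}
        self.task_sets_by_stage = {}

    def submit_tasks(self, task_set: TaskSet) -> None:
        attempts = self.task_sets_by_stage.setdefault(task_set.stage_id, {})
        attempts[task_set.attempt] = task_set
        # Reject the submission if another attempt of this stage is still active.
        conflict = any(
            ts is not task_set and not ts.zombie for ts in attempts.values()
        )
        if conflict:
            ids = ",".join(ts.id for ts in attempts.values())
            raise RuntimeError(
                f"more than one active taskSet for stage {task_set.stage_id}: {ids}"
            )


scheduler = ToyTaskScheduler()
first = TaskSet(stage_id=57, attempt=1, pending={5, 8, 26})
scheduler.submit_tasks(first)

# Resubmitting the stage while attempt 1 is still active trips the check.
try:
    scheduler.submit_tasks(TaskSet(stage_id=57, attempt=2, pending={26}))
    error = None
except RuntimeError as exc:
    error = str(exc)
```

In this model the second submission would be accepted only if the first attempt had been marked as a zombie beforehand, which is the ordering the two resubmission logs above suggest was violated.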



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
