[ https://issues.apache.org/jira/browse/SPARK-6880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727617#comment-14727617 ]

Imran Rashid commented on SPARK-6880:
-------------------------------------

Explanation of the remaining issue from [~markhamstra]: the previous fix was an 
improvement and did completely remove the NPE, but it wasn't quite the right 
fix, because it left some minor issues behind:

bq. Tasks for a Stage that was previously part of a Job that is no longer 
active would be re-submitted as though they were part of the prior Job and with 
no properties set. Since properties are what are used to set an 
other-than-default scheduling pool, this would affect FAIR scheduler usage, but 
it would also affect anything else that depends on the settings of the 
properties (which would be just user code at this point, since Spark itself 
doesn't really use the properties for anything other than Job Group and 
Description, which end up in the WebUI and can be used to kill by JobGroup, 
etc.). Even the default FIFO scheduling would be affected, however, since the 
resubmission of the Tasks under the earlier jobId would effectively give them a 
higher priority/greater urgency than the ActiveJob that now actually needs 
them. In any event, the Tasks would generate correct results.
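
To make the two effects above concrete, here is a minimal, self-contained 
sketch (an illustration only, not the DAGScheduler/TaskScheduler source): 
{{spark.scheduler.pool}} is the property Spark reads for FAIR pool assignment, 
while {{TaskSetInfo}} and {{scheduledBefore}} are hypothetical stand-ins for 
FIFO's jobId-based priority ordering (lower id first, stageId as tie-break).

{code:scala}
import java.util.Properties

// Hypothetical stand-in for the (jobId, stageId) pair a submitted TaskSet carries.
case class TaskSetInfo(jobId: Int, stageId: Int)

object SchedulingSketch {

  // Pool selection comes from the submitting job's properties; with no
  // properties (the resubmission case above) everything falls back to the
  // default pool, no matter which pool the user configured.
  def poolFor(props: Properties): String =
    Option(props)
      .map(_.getProperty("spark.scheduler.pool", "default"))
      .getOrElse("default")

  // FIFO ordering: the lower (earlier) jobId wins, so tasks resubmitted under
  // the old, no-longer-active jobId sort ahead of the job that actually needs them.
  def scheduledBefore(a: TaskSetInfo, b: TaskSetInfo): Boolean =
    if (a.jobId != b.jobId) a.jobId < b.jobId else a.stageId < b.stageId

  def main(args: Array[String]): Unit = {
    val userProps = new Properties()
    userProps.setProperty("spark.scheduler.pool", "production")

    println(poolFor(userProps))  // production
    println(poolFor(null))       // default -- what the resubmitted tasks would get

    // A task set carrying the earlier jobId (3) is scheduled ahead of the
    // ActiveJob (9) that actually needs the stage's output.
    println(scheduledBefore(TaskSetInfo(3, 7), TaskSetInfo(9, 2)))  // true
  }
}
{code}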

> Spark shuts down with NoSuchElementException when running parallel collect on 
> cached RDD
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-6880
>                 URL: https://issues.apache.org/jira/browse/SPARK-6880
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.1
>         Environment: CentOs6.0, java7
>            Reporter: pankaj arora
>            Assignee: pankaj arora
>             Fix For: 1.4.0
>
>
> Spark shuts down with NoSuchElementException when running parallel collect on 
> cached RDDs.
> Below is the stack trace:
> 15/03/27 11:12:43 ERROR DAGSchedulerActorSupervisor: eventProcesserActor 
> failed; shutting down SparkContext
> java.util.NoSuchElementException: key not found: 28
>         at scala.collection.MapLike$class.default(MapLike.scala:228)
>         at scala.collection.AbstractMap.default(Map.scala:58)
>         at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
>         at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:808)
>         at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:778)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:781)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:780)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:780)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:781)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:780)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:780)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:781)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:780)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:780)
>         at 
> org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:762)
>         at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1389)
>         at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>         at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:487)
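
For context on the trace, a minimal sketch of the failure mode follows; the 
{{ActiveJob}}/{{jobIdToActiveJob}} names are borrowed for illustration only, 
and this is neither the actual DAGScheduler code nor the fix that shipped in 
1.4.0.

{code:scala}
import scala.collection.mutable

// Illustrative reduction of the failure in the trace above: apply() on a
// mutable.HashMap for a jobId that has no entry throws NoSuchElementException,
// and in 1.2.x that exception escaped the event-processing actor and shut the
// SparkContext down.
object LookupSketch {
  final case class ActiveJob(jobId: Int, properties: java.util.Properties)

  val jobIdToActiveJob = mutable.HashMap.empty[Int, ActiveJob]

  def main(args: Array[String]): Unit = {
    // Unguarded lookup: java.util.NoSuchElementException: key not found: 28
    try {
      jobIdToActiveJob(28)
    } catch {
      case e: java.util.NoSuchElementException => println(e)
    }

    // A guarded lookup avoids the crash, but silently falling back to
    // null/empty properties is exactly the secondary problem described in the
    // comment above.
    val props = jobIdToActiveJob.get(28).map(_.properties).orNull
    println(props)  // null
  }
}
{code}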


