[ 
https://issues.apache.org/jira/browse/SPARK-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Baoxu Shi updated SPARK-2228:
-----------------------------

    Summary: onStageSubmitted does not properly called so NoSuchElement will be 
thrown in onStageCompleted  (was: onStageSubmitted does not properly called so 
NoSuchElement will throw in onStageCompleted)

> onStageSubmitted does not properly called so NoSuchElement will be thrown in 
> onStageCompleted
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-2228
>                 URL: https://issues.apache.org/jira/browse/SPARK-2228
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0, 1.1.0
>            Reporter: Baoxu Shi
>
> We are using `saveAsObjectFile` and `objectFile` to cut off lineage during 
> iterative computing, but after several hundred iterations a 
> `NoSuchElementException` is thrown. We checked the code and traced the 
> problem to `org.apache.spark.ui.jobs.JobProgressListener`. When 
> `onStageCompleted` is called, the `stageId` cannot be found in 
> `stageIdToPool`, although it does exist in the other HashMaps. So we think 
> `onStageSubmitted` is not being called properly: Spark did add the stage but 
> failed to send the message to the listeners, and the error occurs when the 
> `finish` message is later sent to them. 
> This problem causes a huge number of `active stages` to show up in the 
> `SparkUI`, which is really annoying. It does not seem to affect the final 
> result, though, according to my test code.
> I'm willing to help solve this problem. Any idea which part I should 
> change? I assume `org.apache.spark.scheduler.SparkListenerBus` has something 
> to do with it, but it looks fine to me.
> FYI, here is the test code that reproduces the problem. I do not know 
> how to post code here with highlighting, so I put it in a gist to keep the 
> issue clean.
> https://gist.github.com/bxshi/b5c0fe0ae089c75a39bd
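
The failure mode described in the report can be illustrated with a small, self-contained sketch. It is not Spark's `JobProgressListener` code: `ToyStageListener` and its methods are hypothetical names, and the class only mimics the idea of a `stageIdToPool` map that is filled on submission and read on completion. Assuming the completion handler does a strict map lookup, a missed submit event would surface as a `NoSuchElementException`, whereas a tolerant lookup would not throw.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for a stage-tracking listener.
// This is NOT Spark's JobProgressListener; all names are illustrative.
class ToyStageListener {
  val stageIdToPool = mutable.HashMap[Int, String]()
  val activeStages = mutable.HashSet[Int]()

  // Records the stage when the "submitted" event is delivered.
  def onStageSubmitted(stageId: Int, pool: String): Unit = {
    stageIdToPool(stageId) = pool
    activeStages += stageId
  }

  // Strict lookup: HashMap.apply throws NoSuchElementException
  // if the submit event for this stage was never delivered.
  def onStageCompletedStrict(stageId: Int): String = {
    val pool = stageIdToPool(stageId)
    activeStages -= stageId
    pool
  }

  // Tolerant lookup: returns None instead of throwing for an unknown stage.
  def onStageCompletedTolerant(stageId: Int): Option[String] = {
    activeStages -= stageId
    stageIdToPool.remove(stageId)
  }
}

object ToyStageListenerDemo {
  def main(args: Array[String]): Unit = {
    val listener = new ToyStageListener
    listener.onStageSubmitted(1, "default")
    println(listener.onStageCompletedStrict(1))

    // Stage 2 was never submitted to this listener.
    val strict = scala.util.Try(listener.onStageCompletedStrict(2))
    println(strict.isFailure)
    println(listener.onStageCompletedTolerant(2))
  }
}
```

A tolerant lookup only hides the symptom. If the submit event really is being lost, as the report suspects, the delivery path to the listeners would still need to be investigated.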



--
This message was sent by Atlassian JIRA
(v6.2#6252)
