[ https://issues.apache.org/jira/browse/SPARK-3882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174629#comment-14174629 ]
Patrick Wendell commented on SPARK-3882:
----------------------------------------

This is a known issue (SPARK-2316) that was fixed in Spark 1.1. To verify that you are hitting the same issue, would you mind testing your job with Spark 1.1 and seeing if you observe it?

> JobProgressListener gets permanently out of sync with long running job
> ----------------------------------------------------------------------
>
>                 Key: SPARK-3882
>                 URL: https://issues.apache.org/jira/browse/SPARK-3882
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 1.0.2
>            Reporter: Davis Shepherd
>         Attachments: Screen Shot 2014-10-03 at 12.50.59 PM.png
>
> A long-running Spark context (non-streaming) will eventually start throwing
> the following in the driver:
> {code}
> java.util.NoSuchElementException: key not found: 12771
>     at scala.collection.MapLike$class.default(MapLike.scala:228)
>     at scala.collection.AbstractMap.default(Map.scala:58)
>     at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
>     at org.apache.spark.ui.jobs.JobProgressListener.onStageCompleted(JobProgressListener.scala:79)
>     at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$2.apply(SparkListenerBus.scala:48)
>     at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$2.apply(SparkListenerBus.scala:48)
>     at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:81)
>     at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:79)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.scheduler.SparkListenerBus$class.foreachListener(SparkListenerBus.scala:79)
>     at org.apache.spark.scheduler.SparkListenerBus$class.postToAll(SparkListenerBus.scala:48)
>     at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:32)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:56)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1160)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:46)
> 2014-10-09 18:45:33,523 [SparkListenerBus] ERROR org.apache.spark.scheduler.LiveListenerBus - Listener JobProgressListener threw an exception
> java.util.NoSuchElementException: key not found: 12782
>     at scala.collection.MapLike$class.default(MapLike.scala:228)
>     at scala.collection.AbstractMap.default(Map.scala:58)
>     at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
>     at org.apache.spark.ui.jobs.JobProgressListener.onStageCompleted(JobProgressListener.scala:79)
>     at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$2.apply(SparkListenerBus.scala:48)
>     at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$2.apply(SparkListenerBus.scala:48)
>     at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:81)
>     at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:79)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.scheduler.SparkListenerBus$class.foreachListener(SparkListenerBus.scala:79)
>     at org.apache.spark.scheduler.SparkListenerBus$class.postToAll(SparkListenerBus.scala:48)
>     at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:32)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:56)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1160)
>     at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:46)
> {code}
> And the UI will show running jobs that are in fact no longer running and
> never clean them up (see attached screenshot).
> The result is that the UI becomes unusable, and the JobProgressListener leaks
> memory as the list of "running" jobs continues to grow.
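
A note on the exception itself: the "key not found" message in the trace comes from Scala's mutable.HashMap, whose apply method falls through to MapLike.default and throws NoSuchElementException when the requested key is absent. The sketch below only illustrates that collection behavior and a defensive get-based alternative; the object and map names (StageLookupSketch, stageIdToData) are made up for illustration and are not the actual JobProgressListener code.

{code}
import scala.collection.mutable

// Minimal sketch of the lookup pattern behind the trace above; names are
// illustrative only, not taken from Spark's source.
object StageLookupSketch {
  private val stageIdToData = mutable.HashMap[Int, String]()

  def main(args: Array[String]): Unit = {
    stageIdToData(1) = "data for stage 1"

    // apply() on a mutable.HashMap throws NoSuchElementException("key not
    // found: ...") for an absent key, matching the errors reported above
    // for stage IDs 12771 and 12782.
    try {
      stageIdToData(12771)
    } catch {
      case e: NoSuchElementException => println(s"lookup failed: ${e.getMessage}")
    }

    // A defensive lookup returns an Option instead of throwing, so an event
    // for a stage that is no longer tracked can be logged and skipped.
    stageIdToData.get(12771) match {
      case Some(data) => println(s"found: $data")
      case None       => println("stage 12771 not tracked; ignoring event")
    }
  }
}
{code}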