Re: All of the tasks have been completed but the Stage is still shown as Active?
Similarly, I am seeing tasks moved to the completed section which apparently haven't finished all elements... (succeeded/total 1)... is this related?
Re: All of the tasks have been completed but the Stage is still shown as Active?
Seems like it is related. Possibly those PRs that Andrew mentioned are going to fix this issue.

On Fri, Jul 11, 2014 at 5:51 AM, Haopu Wang hw...@qilinsoft.com wrote:
I saw some exceptions like this in the driver log. Can you shed some light? Is it related to this behaviour?
RE: All of the tasks have been completed but the Stage is still shown as Active?
I saw some exceptions like this in the driver log. Can you shed some light? Is it related to this behaviour?

14/07/11 20:40:09 ERROR LiveListenerBus: Listener JobProgressListener threw an exception
java.util.NoSuchElementException: key not found: 64019
        at scala.collection.MapLike$class.default(MapLike.scala:228)
        at scala.collection.AbstractMap.default(Map.scala:58)
        at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
        at org.apache.spark.ui.jobs.JobProgressListener.onStageCompleted(JobProgressListener.scala:78)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$2.apply(SparkListenerBus.scala:48)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$2.apply(SparkListenerBus.scala:48)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:81)
        at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:79)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.SparkListenerBus$class.foreachListener(SparkListenerBus.scala:79)
        at org.apache.spark.scheduler.SparkListenerBus$class.postToAll(SparkListenerBus.scala:48)
        at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:32)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:56)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1160)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:46)

From: Haopu Wang
Sent: Thursday, July 10, 2014 7:38 PM
To: user@spark.apache.org
Subject: RE: All of the tasks have been completed but the Stage is still shown as Active?

I didn't keep the driver's log. It's a lesson. I will try to run it again to see if it happens again.
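For readers following the trace: the exception is raised in scala.collection.mutable.HashMap.apply, called from JobProgressListener.onStageCompleted, which is how a Scala mutable HashMap behaves when asked for a key it does not hold. The sketch below reproduces only that lookup behaviour in plain Scala, with no Spark dependency. The object, the map stageIdToName, and both helper methods are hypothetical names chosen for illustration; they are not Spark's actual fields, and this is not a claim about how the listener was later fixed.

```scala
import scala.collection.mutable

object StageLookupSketch {
  // Hypothetical stand-in for a listener's per-stage bookkeeping map.
  val stageIdToName: mutable.HashMap[Int, String] =
    mutable.HashMap(1 -> "stage 1")

  // apply() throws java.util.NoSuchElementException when the key is absent,
  // which is the failure mode visible in the driver log.
  def lookupUnsafe(stageId: Int): String = stageIdToName(stageId)

  // get() returns an Option, so an unknown stage id can be handled
  // without throwing.
  def lookupSafe(stageId: Int): Option[String] = stageIdToName.get(stageId)
}
```

Calling StageLookupSketch.lookupUnsafe(64019) throws, while StageLookupSketch.lookupSafe(64019) returns None and leaves the decision to the caller.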
Re: All of the tasks have been completed but the Stage is still shown as Active?
Do you see any errors in the logs of the driver?

On Thu, Jul 10, 2014 at 1:21 AM, Haopu Wang hw...@qilinsoft.com wrote:
I've been running an app for hours in a standalone cluster. Judging from the data injector and the Streaming tab of the web UI, it's running well. However, I see quite a lot of Active stages in the web UI, even though some of them have all of their tasks completed. I've attached a screenshot for your reference. Have you ever seen this kind of behavior?
Re: All of the tasks have been completed but the Stage is still shown as Active?
The History Server is also very helpful.

On Thu, Jul 10, 2014 at 7:37 AM, Haopu Wang hw...@qilinsoft.com wrote:
I didn't keep the driver's log. It's a lesson. I will try to run it again to see if it happens again.

-- SUREN HIRAMAN, VP TECHNOLOGY, Velos
Re: All of the tasks have been completed but the Stage is still shown as Active?
One thing to keep in mind is that the progress bar doesn't take into account tasks which are rerun. If you see 4/4 but the stage is still active, click the stage name and look at the task list. That will show you if any are actually running. When rerun tasks complete, it can result in the number of successful tasks being greater than the number of total tasks; e.g. the progress bar might display 5/4.

Another bug is that a stage might complete and be moved to the completed list, but if tasks are then rerun it will appear in both the completed and active stages list. If it completes again, you will see that stage *twice* in the completed stages list.

Of course, you should only be seeing this behavior if things are going wrong; a node failing, for example.

-- Daniel Siegmann, Software Developer, Velos
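To make the 5/4 case concrete, here is a small model of the counting problem Daniel describes, written in plain Scala with no Spark dependency. Everything in it is hypothetical: TaskSucceeded, naiveProgress and dedupedProgress are illustrative names, and the code is a sketch of the arithmetic only, not Spark's UI code or the change made in any of the PRs mentioned in this thread.

```scala
object ProgressBarSketch {
  // Hypothetical event: one task attempt succeeded for the given partition.
  final case class TaskSucceeded(partition: Int)

  // Naive counter: every successful attempt bumps the numerator,
  // so a rerun of an already-finished partition is counted twice.
  def naiveProgress(events: Seq[TaskSucceeded], totalTasks: Int): String =
    s"${events.size}/$totalTasks"

  // Counting distinct partitions instead keeps the numerator
  // at or below the total.
  def dedupedProgress(events: Seq[TaskSucceeded], totalTasks: Int): String =
    s"${events.map(_.partition).distinct.size}/$totalTasks"
}
```

With four partitions (0 to 3) succeeding and partition 2 then being rerun successfully, naiveProgress gives "5/4" while dedupedProgress gives "4/4".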
Re: All of the tasks have been completed but the Stage is still shown as Active?
Yes, there are a few bugs in the UI in the event of a node failure.

The duplicated stages in both the active and completed tables should be fixed by this PR: https://github.com/apache/spark/pull/1262

The fact that the progress bar on the stages page displays an overflow (e.g. 5/4) is still an open issue, but a related PR fixed the tasks page side of it: https://github.com/apache/spark/pull/1236 (merged)

Keep reporting any additional anomalies you observe (or better yet, file a JIRA here: https://issues.apache.org/jira/browse/SPARK)!