[jira] [Commented] (SPARK-1603) flaky test case in StreamingContextSuite

2014-05-16 Thread Tathagata Das (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999241#comment-13999241
 ] 

Tathagata Das commented on SPARK-1603:
--

I think we haven't seen the flakiness since then, so I am marking this as 
resolved. 

 flaky test case in StreamingContextSuite
 

 Key: SPARK-1603
 URL: https://issues.apache.org/jira/browse/SPARK-1603
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9.0, 1.0.0, 0.9.1
Reporter: Nan Zhu
Assignee: Nan Zhu

 When Jenkins was testing 5 PRs at the same time, the test results in my PR 
 showed that the "stop gracefully" test in StreamingContextSuite failed. 
 The stack trace is as follows:
 {quote}
  stop gracefully *** FAILED *** (8 seconds, 350 milliseconds)
 [info]   akka.actor.InvalidActorNameException: actor name [JobScheduler] is not unique!
 [info]   at akka.actor.dungeon.ChildrenContainer$TerminatingChildrenContainer.reserve(ChildrenContainer.scala:192)
 [info]   at akka.actor.dungeon.Children$class.reserveChild(Children.scala:77)
 [info]   at akka.actor.ActorCell.reserveChild(ActorCell.scala:338)
 [info]   at akka.actor.dungeon.Children$class.makeChild(Children.scala:186)
 [info]   at akka.actor.dungeon.Children$class.attachChild(Children.scala:42)
 [info]   at akka.actor.ActorCell.attachChild(ActorCell.scala:338)
 [info]   at akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:518)
 [info]   at org.apache.spark.streaming.scheduler.JobScheduler.start(JobScheduler.scala:57)
 [info]   at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:434)
 [info]   at org.apache.spark.streaming.StreamingContextSuite$$anonfun$14$$anonfun$apply$mcV$sp$3.apply$mcVI$sp(StreamingContextSuite.scala:174)
 [info]   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
 [info]   at org.apache.spark.streaming.StreamingContextSuite$$anonfun$14.apply$mcV$sp(StreamingContextSuite.scala:163)
 [info]   at org.apache.spark.streaming.StreamingContextSuite$$anonfun$14.apply(StreamingContextSuite.scala:159)
 [info]   at org.apache.spark.streaming.StreamingContextSuite$$anonfun$14.apply(StreamingContextSuite.scala:159)
 [info]   at org.scalatest.FunSuite$$anon$1.apply(FunSuite.scala:1265)
 [info]   at org.scalatest.Suite$class.withFixture(Suite.scala:1974)
 [info]   at org.apache.spark.streaming.StreamingContextSuite.withFixture(StreamingContextSuite.scala:34)
 [info]   at org.scalatest.FunSuite$class.invokeWithFixture$1(FunSuite.scala:1262)
 [info]   at org.scalatest.FunSuite$$anonfun$runTest$1.apply(FunSuite.scala:1271)
 [info]   at org.scalatest.FunSuite$$anonfun$runTest$1.apply(FunSuite.scala:1271)
 [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:198)
 [info]   at org.scalatest.FunSuite$class.runTest(FunSuite.scala:1271)
 [info]   at org.apache.spark.streaming.StreamingContextSuite.org$scalatest$BeforeAndAfter$$super$runTest(StreamingContextSuite.scala:34)
 [info]   at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:171)
 [info]   at org.apache.spark.streaming.StreamingContextSuite.runTest(StreamingContextSuite.scala:34)
 [info]   at org.scalatest.FunSuite$$anonfun$runTests$1.apply(FunSuite.scala:1304)
 [info]   at org.scalatest.FunSuite$$anonfun$runTests$1.apply(FunSuite.scala:1304)
 [info]   at org.scalatest.SuperEngine$$anonfun$org$scalatest$SuperEngine$$runTestsInBranch$1.apply(Engine.scala:260)
 [info]   at org.scalatest.SuperEngine$$anonfun$org$scalatest$SuperEngine$$runTestsInBranch$1.apply(Engine.scala:249)
 [info]   at scala.collection.immutable.List.foreach(List.scala:318)
 [info]   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:249)
 [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:326)
 [info]   at org.scalatest.FunSuite$class.runTests(FunSuite.scala:1304)
 [info]   at org.apache.spark.streaming.StreamingContextSuite.runTests(StreamingContextSuite.scala:34)
 [info]   at org.scalatest.Suite$class.run(Suite.scala:2303)
 [info]   at org.apache.spark.streaming.StreamingContextSuite.org$scalatest$FunSuite$$super$run(StreamingContextSuite.scala:34)
 [info]   at org.scalatest.FunSuite$$anonfun$run$1.apply(FunSuite.scala:1310)
 [info]   at org.scalatest.FunSuite$$anonfun$run$1.apply(FunSuite.scala:1310)
 [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:362)
 [info]   at org.scalatest.FunSuite$class.run(FunSuite.scala:1310)
 [info]   at org.apache.spark.streaming.StreamingContextSuite.org$scalatest$BeforeAndAfter$$super$run(StreamingContextSuite.scala:34)
 [info]   at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:208)
 [info]   at org.apache.spark.streaming.StreamingContextSuite.run(StreamingContextSuite.scala:34)
 [info]   at 
 

[jira] [Commented] (SPARK-1603) flaky test case in StreamingContextSuite

2014-05-05 Thread Nan Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990209#comment-13990209
 ] 

Nan Zhu commented on SPARK-1603:


[~tdas], I checked the code. The fix itself should be pretty easy: just using 
an auto-generated actor name is enough.

The reason is that akka.system.stop() is an asynchronous method, which means 
that there is no guarantee on when the actor is really stopped...
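
The race described above can be illustrated with a minimal sketch that uses only the Scala standard library. It is not Akka's API: `ToyRegistry`, `register`, `registerAuto`, `stop`, and `AsyncStopDemo` are hypothetical names invented for this illustration, and the `Promise` stands in for whatever internal work delays an actor's termination.

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

// Toy stand-in for an actor system's name registry (hypothetical, NOT Akka's API).
final class ToyRegistry()(implicit ec: ExecutionContext) {
  private val names = TrieMap.empty[String, Unit]
  private val counter = new AtomicLong(0)

  // Reserve a fixed name; fails while a previous holder still owns it.
  def register(name: String): Either[String, String] =
    if (names.putIfAbsent(name, ()).isEmpty) Right(name)
    else Left(s"actor name [$name] is not unique!")

  // Reserve a generated name, which cannot clash with a name still being released.
  def registerAuto(prefix: String): String = {
    val name = s"$prefix-${counter.incrementAndGet()}"
    names.put(name, ())
    name
  }

  // Asynchronous stop: returns at once; the name is freed only after `release` completes.
  def stop(name: String, release: Future[Unit]): Future[Unit] =
    release.map { _ => names.remove(name); () }
}

object AsyncStopDemo {
  def run(): (Either[String, String], String, Either[String, String]) = {
    import ExecutionContext.Implicits.global
    val registry = new ToyRegistry()
    registry.register("JobScheduler")

    val gate = Promise[Unit]()                         // models the pending termination
    val stopped = registry.stop("JobScheduler", gate.future)

    val tooEarly = registry.register("JobScheduler")   // name is still reserved
    val auto = registry.registerAuto("JobScheduler")   // generated name avoids the clash

    gate.success(())                                   // termination finally completes
    Await.result(stopped, 5.seconds)
    val afterStop = registry.register("JobScheduler")  // now the fixed name is free again
    (tooEarly, auto, afterStop)
  }
}
```

In this sketch, re-registering the fixed name right after `stop` returns fails, while a generated name or a registration made after the stop has completed succeeds. Note that generated names only avoid the clash; they would not reveal an actor that is never stopped.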


[jira] [Commented] (SPARK-1603) flaky test case in StreamingContextSuite

2014-05-05 Thread Tathagata Das (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990213#comment-13990213
 ] 

Tathagata Das commented on SPARK-1603:
--

I think I made a temporary fix for that test and bundled it with this PR:
https://github.com/apache/spark/pull/652/files#diff-e144dbee130ed84f9465853ddce65f8eR186
 

I am a little afraid that there is some corner case that leads to actors being 
leaked, and that adding auto-generated names would mask that problem. So I just 
added a Thread.sleep(100) in that graceful shutdown test so that it gives the 
system time to stop and clean up the actor before a new StreamingContext is 
started (100 ms should be enough). If the problem persists (the test is still 
flaky), then it is more likely that there is a corner case where the actor is 
not being stopped every time.

Will close this JIRA after a few days of observation.


[jira] [Commented] (SPARK-1603) flaky test case in StreamingContextSuite

2014-05-05 Thread Nan Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990216#comment-13990216
 ] 

Nan Zhu commented on SPARK-1603:


Ah, I just made a PR at the same time: https://github.com/apache/spark/pull/659

I'm afraid a fixed time threshold cannot resolve the problem, especially since 
this case is hard to reproduce and always happens when Jenkins is super 
overloaded. I once met a similar problem in another PR, 
https://github.com/apache/spark/pull/186, where we have an asynchronous 
constructor...

You are right, we should check whether there are cases where we forgot to close 
the actor (but I think that since you call ssc.stop() after each test case, the 
actor should be closed eventually).
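
One common alternative to a fixed sleep is to poll for the condition with an upper bound, so a fast machine proceeds immediately and an overloaded one gets more time. The following is a minimal standard-library sketch of that idea, not code from either PR; `WaitUntil`, `waitUntil`, and the default poll interval are hypothetical choices made for illustration.

```scala
import scala.annotation.tailrec
import scala.concurrent.duration._

object WaitUntil {
  // Polls `condition` every `interval` until it holds or `timeout` elapses.
  // Returns true if the condition held before the deadline, false otherwise.
  def waitUntil(timeout: FiniteDuration, interval: FiniteDuration = 10.millis)(
      condition: => Boolean): Boolean = {
    val deadline = timeout.fromNow
    @tailrec
    def loop(): Boolean =
      if (condition) true                  // done as soon as the condition holds
      else if (deadline.isOverdue()) false // give up once the bound is exceeded
      else { Thread.sleep(interval.toMillis); loop() }
    loop()
  }
}
```

A test could call `WaitUntil.waitUntil(10.seconds)(previousContextFullyStopped)` before starting the next context, where `previousContextFullyStopped` is a placeholder for whatever check the system under test can actually provide. The timeout then only bounds the worst case instead of being paid on every run.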
