[jira] [Commented] (SPARK-1603) flaky test case in StreamingContextSuite
[ https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999241#comment-13999241 ]

Tathagata Das commented on SPARK-1603:

I think we haven't seen the flakiness since then, so I am marking this as resolved.

flaky test case in StreamingContextSuite

Key: SPARK-1603
URL: https://issues.apache.org/jira/browse/SPARK-1603
Project: Spark
Issue Type: Bug
Components: Streaming
Affects Versions: 0.9.0, 1.0.0, 0.9.1
Reporter: Nan Zhu
Assignee: Nan Zhu

When Jenkins was testing 5 PRs at the same time, the test results in my PR showed that "stop gracefully" in StreamingContextSuite failed. The stack trace is as follows:

{quote}
stop gracefully *** FAILED *** (8 seconds, 350 milliseconds)
[info] akka.actor.InvalidActorNameException: actor name [JobScheduler] is not unique!
[info] at akka.actor.dungeon.ChildrenContainer$TerminatingChildrenContainer.reserve(ChildrenContainer.scala:192)
[info] at akka.actor.dungeon.Children$class.reserveChild(Children.scala:77)
[info] at akka.actor.ActorCell.reserveChild(ActorCell.scala:338)
[info] at akka.actor.dungeon.Children$class.makeChild(Children.scala:186)
[info] at akka.actor.dungeon.Children$class.attachChild(Children.scala:42)
[info] at akka.actor.ActorCell.attachChild(ActorCell.scala:338)
[info] at akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:518)
[info] at org.apache.spark.streaming.scheduler.JobScheduler.start(JobScheduler.scala:57)
[info] at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:434)
[info] at org.apache.spark.streaming.StreamingContextSuite$$anonfun$14$$anonfun$apply$mcV$sp$3.apply$mcVI$sp(StreamingContextSuite.scala:174)
[info] at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
[info] at org.apache.spark.streaming.StreamingContextSuite$$anonfun$14.apply$mcV$sp(StreamingContextSuite.scala:163)
[info] at org.apache.spark.streaming.StreamingContextSuite$$anonfun$14.apply(StreamingContextSuite.scala:159)
[info] at org.apache.spark.streaming.StreamingContextSuite$$anonfun$14.apply(StreamingContextSuite.scala:159)
[info] at org.scalatest.FunSuite$$anon$1.apply(FunSuite.scala:1265)
[info] at org.scalatest.Suite$class.withFixture(Suite.scala:1974)
[info] at org.apache.spark.streaming.StreamingContextSuite.withFixture(StreamingContextSuite.scala:34)
[info] at org.scalatest.FunSuite$class.invokeWithFixture$1(FunSuite.scala:1262)
[info] at org.scalatest.FunSuite$$anonfun$runTest$1.apply(FunSuite.scala:1271)
[info] at org.scalatest.FunSuite$$anonfun$runTest$1.apply(FunSuite.scala:1271)
[info] at org.scalatest.SuperEngine.runTestImpl(Engine.scala:198)
[info] at org.scalatest.FunSuite$class.runTest(FunSuite.scala:1271)
[info] at org.apache.spark.streaming.StreamingContextSuite.org$scalatest$BeforeAndAfter$$super$runTest(StreamingContextSuite.scala:34)
[info] at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:171)
[info] at org.apache.spark.streaming.StreamingContextSuite.runTest(StreamingContextSuite.scala:34)
[info] at org.scalatest.FunSuite$$anonfun$runTests$1.apply(FunSuite.scala:1304)
[info] at org.scalatest.FunSuite$$anonfun$runTests$1.apply(FunSuite.scala:1304)
[info] at org.scalatest.SuperEngine$$anonfun$org$scalatest$SuperEngine$$runTestsInBranch$1.apply(Engine.scala:260)
[info] at org.scalatest.SuperEngine$$anonfun$org$scalatest$SuperEngine$$runTestsInBranch$1.apply(Engine.scala:249)
[info] at scala.collection.immutable.List.foreach(List.scala:318)
[info] at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:249)
[info] at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:326)
[info] at org.scalatest.FunSuite$class.runTests(FunSuite.scala:1304)
[info] at org.apache.spark.streaming.StreamingContextSuite.runTests(StreamingContextSuite.scala:34)
[info] at org.scalatest.Suite$class.run(Suite.scala:2303)
[info] at org.apache.spark.streaming.StreamingContextSuite.org$scalatest$FunSuite$$super$run(StreamingContextSuite.scala:34)
[info] at org.scalatest.FunSuite$$anonfun$run$1.apply(FunSuite.scala:1310)
[info] at org.scalatest.FunSuite$$anonfun$run$1.apply(FunSuite.scala:1310)
[info] at org.scalatest.SuperEngine.runImpl(Engine.scala:362)
[info] at org.scalatest.FunSuite$class.run(FunSuite.scala:1310)
[info] at org.apache.spark.streaming.StreamingContextSuite.org$scalatest$BeforeAndAfter$$super$run(StreamingContextSuite.scala:34)
[info] at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:208)
[info] at org.apache.spark.streaming.StreamingContextSuite.run(StreamingContextSuite.scala:34)
[info] at
[jira] [Commented] (SPARK-1603) flaky test case in StreamingContextSuite
[ https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990209#comment-13990209 ]

Nan Zhu commented on SPARK-1603:

[~tdas], I checked the code. The fix itself should be pretty easy: just using an auto-generated name is OK. The reason is that akka.system.stop() is an asynchronous method, which means you have no guarantee about when the actor is really stopped...
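The name clash described in this comment can be illustrated with a minimal sketch. It is a toy model written against the Scala standard library only, not Akka's implementation: `ToyRegistry`, `requestStop`, and `completeStop` are hypothetical names introduced here, and the generated-name format is an assumption made for the example.

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.collection.mutable

// Toy stand-in for an actor system's child-name bookkeeping (not Akka itself).
final class ToyRegistry {
  private val reserved = mutable.Set.empty[String]
  private val counter = new AtomicLong(0)

  // Register under a caller-chosen name; fails while the name is still reserved.
  def startNamed(name: String): String = synchronized {
    if (reserved.contains(name))
      throw new IllegalStateException(s"actor name [$name] is not unique!")
    reserved += name
    name
  }

  // Register under a generated name, which cannot clash with a terminating one.
  def startAnonymous(): String = synchronized {
    val name = "$" + counter.getAndIncrement()
    reserved += name
    name
  }

  // Models an asynchronous stop: the request returns at once, the name stays reserved.
  def requestStop(name: String): Unit = ()

  // Models the later moment when termination actually completes.
  def completeStop(name: String): Unit = synchronized { reserved -= name }
}

object NameClashDemo {
  def main(args: Array[String]): Unit = {
    val registry = new ToyRegistry
    registry.startNamed("JobScheduler")
    registry.requestStop("JobScheduler") // returns before cleanup is done
    val clash = scala.util.Try(registry.startNamed("JobScheduler"))
    println(s"restart with fixed name failed: ${clash.isFailure}")
    println(s"restart with generated name: ${registry.startAnonymous()}")
    registry.completeStop("JobScheduler")
    println(s"restart after cleanup: ${registry.startNamed("JobScheduler")}")
  }
}
```

In this model the fixed name is only reusable once termination has completed, while a generated name works at any time, which is the trade-off the comment points at.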
[jira] [Commented] (SPARK-1603) flaky test case in StreamingContextSuite
[ https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990213#comment-13990213 ]

Tathagata Das commented on SPARK-1603:

I think I made a temporary fix for that test and bundled it with this PR:
https://github.com/apache/spark/pull/652/files#diff-e144dbee130ed84f9465853ddce65f8eR186

I am a little afraid that there is some corner case that leads to actors being leaked, and adding auto-generated names would mask that problem. So I just added a Thread.sleep(100) in that graceful shutdown test, so that it gives the system time to stop and clean up the actor before a new StreamingContext is started (100 ms should be enough). If the problem persists (the test is still flaky), then it is more likely that there is a corner case where the actor is not being stopped every time.

Will close this JIRA after a few days of observation.
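The concern that auto-generated names could hide a leak can be shown with a small sketch. This is a toy model, not Akka or Spark code: `LiveActors` and its methods are hypothetical names made up for the illustration, and the "forgotten stop" is a simulated bug.

```scala
import scala.collection.mutable

// Toy bookkeeping of live actors (a stand-in, not Akka's API).
final class LiveActors {
  private val live = mutable.Set.empty[String]
  private var nextId = 0L

  // Starts an actor under a generated name; never fails, even if older actors linger.
  def startAnonymous(): String = synchronized {
    val name = "$" + nextId
    nextId += 1
    live += name
    name
  }

  def stop(name: String): Unit = synchronized { live -= name }

  def liveCount: Int = synchronized { live.size }
}

object LeakCheckDemo {
  def main(args: Array[String]): Unit = {
    val actors = new LiveActors
    for (i <- 1 to 3) {
      val name = actors.startAnonymous()
      // Simulated bug: the second iteration forgets to stop its actor.
      if (i != 2) actors.stop(name)
    }
    // No name clash ever occurred, yet one actor is still alive.
    println(s"live actors after all iterations: ${actors.liveCount}")
  }
}
```

With a fixed name, the forgotten stop would surface as a name clash on the next start. With generated names it stays silent unless something like the live count is checked explicitly.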
[jira] [Commented] (SPARK-1603) flaky test case in StreamingContextSuite
[ https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990216#comment-13990216 ]

Nan Zhu commented on SPARK-1603:

Ah, I just made a PR at the same time: https://github.com/apache/spark/pull/659

I'm afraid a fixed time threshold cannot resolve the problem, especially since this case is hard to reproduce and always happens when Jenkins is heavily overloaded. I once met a similar problem in another PR, https://github.com/apache/spark/pull/186, where we have an asynchronous constructor...

You are right, we should check whether there are cases where we forgot to close the actor (but I think that, since you call ssc.stop() after each test case, the actor should be closed eventually).
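One common alternative to a fixed sleep, sketched here under assumptions, is to poll for the condition with an upper time bound, so a fast machine proceeds quickly and an overloaded one gets more time. The helper `waitUntil` and the `stopped` flag are hypothetical names for this illustration; neither PR mentioned in the thread is claimed to use this approach.

```scala
import java.util.concurrent.atomic.AtomicBoolean

object WaitUntil {
  // Polls `condition` every `intervalMs` until it holds or `timeoutMs` elapses.
  // Returns true if the condition became true in time, false on timeout.
  def waitUntil(timeoutMs: Long, intervalMs: Long)(condition: => Boolean): Boolean = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var ok = condition
    while (!ok && System.nanoTime() < deadline) {
      Thread.sleep(intervalMs)
      ok = condition
    }
    ok
  }

  def main(args: Array[String]): Unit = {
    val stopped = new AtomicBoolean(false)
    // Hypothetical stand-in for asynchronous cleanup that finishes at an unknown time.
    val cleanup = new Thread(new Runnable {
      def run(): Unit = { Thread.sleep(50); stopped.set(true) }
    })
    cleanup.start()
    val ready = waitUntil(timeoutMs = 5000, intervalMs = 10)(stopped.get())
    println(s"cleanup observed before timeout: $ready")
    cleanup.join()
  }
}
```

The timeout only bounds the worst case; the wait ends as soon as the condition holds. ScalaTest's `Eventually` trait offers a comparable retry-until-timeout facility for assertions inside tests.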