[ https://issues.apache.org/jira/browse/SPARK-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344602#comment-14344602 ]
Cheng Lian commented on SPARK-6020: ----------------------------------- Hey [~andrewor14], I think [PR #4835|https://github.com/apache/spark/pull/4835] has already fixed this PR. {{InMemoryColumnarTableScan}} uses accumulators to generate debugging information for testing purposes. Test failures related to this JIRA ticket showed that accumulator updates got lost nondeterministically, which was also exactly what PR #4835 fixed. Also, according to [your amazing statistics sheets|https://docs.google.com/spreadsheets/d/1VSCTXLBqnglk0XMd0R4IhvUPb2MEDQUaRNwxcHBtSlk/edit#gid=52877182], this test suite hadn't been flaky for a week. > Flaky test: o.a.s.sql.columnar.PartitionBatchPruningSuite > --------------------------------------------------------- > > Key: SPARK-6020 > URL: https://issues.apache.org/jira/browse/SPARK-6020 > Project: Spark > Issue Type: Bug > Components: SQL, Tests > Affects Versions: 1.3.0 > Reporter: Andrew Or > Assignee: Cheng Lian > Priority: Critical > > Observed in the following builds, only one of which has something to do with > SQL: > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27931/ > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27930/ > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27929/ > org.apache.spark.sql.columnar.PartitionBatchPruningSuite.SELECT key FROM > pruningData WHERE NOT (key IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, > 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30)) > {code} > Error Message > 8 did not equal 10 Wrong number of read batches: == Parsed Logical Plan == > 'Project ['key] 'Filter NOT 'key IN > (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) > 'UnresolvedRelation [pruningData], None == Analyzed Logical Plan == > Project [key#5245] Filter NOT key#5245 IN > (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) > LogicalRDD [key#5245,value#5246], MapPartitionsRDD[3202] at mapPartitions > at ExistingRDD.scala:35 == Optimized Logical Plan == Project [key#5245] > Filter NOT key#5245 INSET > (5,10,24,25,14,20,29,1,6,28,21,9,13,2,17,22,27,12,7,3,18,16,11,26,23,8,30,19,4,15) > InMemoryRelation [key#5245,value#5246], true, 10, StorageLevel(true, true, > false, true, 1), (PhysicalRDD [key#5245,value#5246], MapPartitionsRDD[3202] > at mapPartitions at ExistingRDD.scala:35), Some(pruningData) == Physical > Plan == Filter NOT key#5245 INSET > (5,10,24,25,14,20,29,1,6,28,21,9,13,2,17,22,27,12,7,3,18,16,11,26,23,8,30,19,4,15) > InMemoryColumnarTableScan [key#5245], [NOT key#5245 INSET > (5,10,24,25,14,20,29,1,6,28,21,9,13,2,17,22,27,12,7,3,18,16,11,26,23,8,30,19,4,15)], > (InMemoryRelation [key#5245,value#5246], true, 10, StorageLevel(true, true, > false, true, 1), (PhysicalRDD [key#5245,value#5246], MapPartitionsRDD[3202] > at mapPartitions at ExistingRDD.scala:35), Some(pruningData)) Code > Generation: false == RDD == > Stacktrace > sbt.ForkMain$ForkError: 8 did not equal 10 Wrong number of read batches: == > Parsed Logical Plan == > 'Project ['key] > 'Filter NOT 'key IN > (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) > 'UnresolvedRelation [pruningData], None > == Analyzed Logical Plan == > Project [key#5245] > Filter NOT key#5245 IN > (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) > LogicalRDD [key#5245,value#5246], MapPartitionsRDD[3202] at mapPartitions > at ExistingRDD.scala:35 > == Optimized Logical Plan == > Project [key#5245] > Filter NOT key#5245 INSET > (5,10,24,25,14,20,29,1,6,28,21,9,13,2,17,22,27,12,7,3,18,16,11,26,23,8,30,19,4,15) > InMemoryRelation [key#5245,value#5246], true, 10, StorageLevel(true, true, > false, true, 1), (PhysicalRDD [key#5245,value#5246], MapPartitionsRDD[3202] > at mapPartitions at ExistingRDD.scala:35), Some(pruningData) > == Physical Plan == > Filter NOT key#5245 INSET > (5,10,24,25,14,20,29,1,6,28,21,9,13,2,17,22,27,12,7,3,18,16,11,26,23,8,30,19,4,15) > InMemoryColumnarTableScan [key#5245], [NOT key#5245 INSET > (5,10,24,25,14,20,29,1,6,28,21,9,13,2,17,22,27,12,7,3,18,16,11,26,23,8,30,19,4,15)], > (InMemoryRelation [key#5245,value#5246], true, 10, StorageLevel(true, true, > false, true, 1), (PhysicalRDD [key#5245,value#5246], MapPartitionsRDD[3202] > at mapPartitions at ExistingRDD.scala:35), Some(pruningData)) > Code Generation: false > == RDD == > at > org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500) > at > org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) > at > org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466) > at > org.apache.spark.sql.columnar.PartitionBatchPruningSuite$$anonfun$checkBatchPruning$1.apply$mcV$sp(PartitionBatchPruningSuite.scala:119) > at > org.apache.spark.sql.columnar.PartitionBatchPruningSuite$$anonfun$checkBatchPruning$1.apply(PartitionBatchPruningSuite.scala:107) > at > org.apache.spark.sql.columnar.PartitionBatchPruningSuite$$anonfun$checkBatchPruning$1.apply(PartitionBatchPruningSuite.scala:107) > at > org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) > at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) > at org.scalatest.Transformer.apply(Transformer.scala:20) > at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) > at org.scalatest.Suite$class.withFixture(Suite.scala:1122) > at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) > at > org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) > at > org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) > at > org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) > at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) > at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) > at > org.apache.spark.sql.columnar.PartitionBatchPruningSuite.org$scalatest$BeforeAndAfter$$super$runTest(PartitionBatchPruningSuite.scala:26) > at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200) > at > org.apache.spark.sql.columnar.PartitionBatchPruningSuite.runTest(PartitionBatchPruningSuite.scala:26) > at > org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) > at > org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) > at > org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) > at > org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) > at scala.collection.immutable.List.foreach(List.scala:318) > at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) > at > org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) > at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) > at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) > at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) > at org.scalatest.Suite$class.run(Suite.scala:1424) > at > org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) > at > org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) > at > org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) > at org.scalatest.SuperEngine.runImpl(Engine.scala:545) > at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) > at > org.apache.spark.sql.columnar.PartitionBatchPruningSuite.org$scalatest$BeforeAndAfterAll$$super$run(PartitionBatchPruningSuite.scala:26) > at > org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257) > at > org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256) > at > org.apache.spark.sql.columnar.PartitionBatchPruningSuite.org$scalatest$BeforeAndAfter$$super$run(PartitionBatchPruningSuite.scala:26) > at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241) > at > org.apache.spark.sql.columnar.PartitionBatchPruningSuite.run(PartitionBatchPruningSuite.scala:26) > at > org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) > at > org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org