[GitHub] spark pull request #21214: [SPARK-23775][TEST] Make DataFrameRangeSuite not ...

vanzin Wed, 02 May 2018 14:56:04 -0700

Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21214#discussion_r185650518
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameRangeSuite.scala ---
    @@ -153,23 +153,17 @@ class DataFrameRangeSuite extends QueryTest with SharedSQLContext with Eventuall
     
       test("Cancelling stage in a query with Range.") {
         val listener = new SparkListener {
    -      override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    -        eventually(timeout(10.seconds), interval(1.millis)) {
    -          assert(DataFrameRangeSuite.stageToKill > 0)
    -        }
    -        sparkContext.cancelStage(DataFrameRangeSuite.stageToKill)
    +      override def onTaskStart(taskStart: SparkListenerTaskStart): Unit = {
    +        sparkContext.cancelStage(taskStart.stageId)
           }
         }
     
         sparkContext.addSparkListener(listener)
         for (codegen <- Seq(true, false)) {
           withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> codegen.toString()) {
    -        DataFrameRangeSuite.stageToKill = -1
             val ex = intercept[SparkException] {
    -          spark.range(0, 100000000000L, 1, 1).map { x =>
    -            DataFrameRangeSuite.stageToKill = TaskContext.get().stageId()
    -            x
    -          }.toDF("id").agg(sum("id")).collect()
    +          spark.range(0, 100000000000L, 1, 1)
    --- End diff --
    
    This is ok-ish, but this kind of test is still racy. There's no guarantee the job won't finish before the events are posted to the bus, processed by the listener, and the stage is cancelled. The large count only makes that less likely; it doesn't rule it out.
    
    You could use a `CountDownLatch` to close that window: wait on it in the task (the task is then known to be running, so its task start event is fired) and signal it in the listener.
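
The handshake can be modelled without Spark. The following is a minimal sketch, not the code from this pull request: two plain threads stand in for the task and the listener, and every name in it (`LatchHandshakeSketch`, `demo`, `taskStarted`, `cancelIssued`, `cancelRequested`) is hypothetical. The point it illustrates is only the ordering: the task cannot get past `await` until the listener has issued the cancel request.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

// Spark-free model of the suggested handshake. All names are hypothetical.
object LatchHandshakeSketch {

  /** Returns true if the stand-in task saw the cancel request before resuming. */
  def demo(): Boolean = {
    // Models the task start event reaching the listener.
    val taskStarted = new CountDownLatch(1)
    // The latch from the review comment: awaited in the task, signalled in the listener.
    val cancelIssued = new CountDownLatch(1)
    // Stands in for the effect of calling cancelStage.
    val cancelRequested = new AtomicBoolean(false)
    val sawCancelBeforeResuming = new AtomicBoolean(false)

    // Stand-in for the task body (the closure passed to map in the test).
    val task = new Thread(new Runnable {
      override def run(): Unit = {
        taskStarted.countDown() // the task is running, so the start event has fired
        // Block until the listener has acted; the timeout avoids a hung test.
        if (cancelIssued.await(10, TimeUnit.SECONDS)) {
          sawCancelBeforeResuming.set(cancelRequested.get())
        }
      }
    })

    // Stand-in for the listener's task start callback.
    val listener = new Thread(new Runnable {
      override def run(): Unit = {
        if (taskStarted.await(10, TimeUnit.SECONDS)) {
          cancelRequested.set(true) // where the real listener would call cancelStage
          cancelIssued.countDown()  // release the task only after the cancel request
        }
      }
    })

    listener.start()
    task.start()
    task.join()
    listener.join()
    sawCancelBeforeResuming.get()
  }
}
```

In the actual suite, the latch would presumably have to live in the `DataFrameRangeSuite` companion object, as `stageToKill` did, since a `CountDownLatch` is not serializable and the task closure can only reach it statically within the same JVM. The `await` call would go inside the `map` closure and the `countDown` call after `sparkContext.cancelStage(...)` in `onTaskStart`.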



