GitHub user gaborgsomogyi opened a pull request:

    https://github.com/apache/spark/pull/20888

    [SPARK-23775][TEST] DataFrameRangeSuite should wait for first stage

    ## What changes were proposed in this pull request?
    
    DataFrameRangeSuite.test("Cancelling stage in a query with Range.") sometimes gets stuck in an infinite loop and times out the build.
    
    I presume the original intention of this test is to start a job with a range and then cancel it.
    The submitted job has 2 stages, and I think the author intended to cancel the first stage, which has ID 0. That is not what happens here, because the check waits for a stage ID greater than 0:
    
    ```
    eventually(timeout(10.seconds), interval(1.millis)) {
      assert(DataFrameRangeSuite.stageToKill > 0)
    }
    ```
    
    As a result, if the first stage takes longer than 10 seconds, the check throws TestFailedDueToTimeoutException and cancelStage is never called.
    
    This PR changes the test behaviour to wait for the first valid task ID and 
cancel that one.
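    
    The difference between the two waiting conditions can be sketched without Spark. This is only an illustration under stated assumptions, not the code in this PR: `StageWaitSketch`, `InvalidStageId`, `waitFor` and `reportStage` are hypothetical names, `waitFor` is a simplified stand-in for ScalaTest's `eventually`, and a plain thread plays the role of the task that reports its stage ID.
    
    ```scala
    import java.util.concurrent.atomic.AtomicInteger
    
    object StageWaitSketch {
      // Hypothetical sentinel: stage IDs are non-negative, so -1 never collides with a real one.
      val InvalidStageId = -1
    
      // Polls `condition` every `intervalMs` until it holds or `timeoutMs` elapses.
      // Simplified stand-in for `eventually`: returns false on timeout instead of throwing.
      def waitFor(timeoutMs: Long, intervalMs: Long)(condition: => Boolean): Boolean = {
        val deadline = System.nanoTime() + timeoutMs * 1000000L
        while (System.nanoTime() < deadline) {
          if (condition) return true
          Thread.sleep(intervalMs)
        }
        condition
      }
    
      // Simulates a task that publishes the ID of the stage it runs in.
      def reportStage(slot: AtomicInteger, stageId: Int): Thread = {
        val t = new Thread(new Runnable { def run(): Unit = slot.set(stageId) })
        t.start()
        t
      }
    
      def main(args: Array[String]): Unit = {
        // Old condition: default value 0, wait for "> 0". Stage 0 is never observed.
        val oldSlot = new AtomicInteger(0)
        reportStage(oldSlot, 0).join()
        println(waitFor(200, 1)(oldSlot.get() > 0)) // prints "false"
    
        // Sentinel-based condition: any reported ID, including 0, ends the wait.
        val newSlot = new AtomicInteger(InvalidStageId)
        reportStage(newSlot, 0).join()
        println(waitFor(200, 1)(newSlot.get() != InvalidStageId)) // prints "true"
      }
    }
    ```
    
    With the sentinel, the wait ends as soon as any stage has reported its ID, so the cancel call no longer depends on the first stage finishing within the timeout.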
    
    ## How was this patch tested?
    
    Existing unit test.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gaborgsomogyi/spark SPARK-23775

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20888.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20888
    
----
commit 42c930d694e0bbc66974516b6719a698d664f681
Author: Gabor Somogyi <gabor.g.somogyi@...>
Date:   2018-03-23T02:37:27Z

    [SPARK-23775][TEST] DataFrameRangeSuite should wait for first stage

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
