[GitHub] spark pull request #21210: [SPARK-23489][SQL][TEST] HiveExternalCatalogVersi...

2018-05-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21210


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21210: [SPARK-23489][SQL][TEST] HiveExternalCatalogVersi...

2018-05-01 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/21210

[SPARK-23489][SQL][TEST] HiveExternalCatalogVersionsSuite should verify the downloaded file

## What changes were proposed in this pull request?

Although `HiveExternalCatalogVersionsSuite` is designed to try downloading from 
Apache mirrors three times, it has been flaky because it does not verify the 
downloaded file. Some Apache mirrors terminate the download abnormally, and the 
resulting *corrupted* file produces the following errors.

```
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
22:46:32.700 WARN org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite: 

= POSSIBLE THREAD LEAK IN SUITE o.a.s.sql.hive.HiveExternalCatalogVersionsSuite, thread names: Keep-Alive-Timer =

*** RUN ABORTED ***
  java.io.IOException: Cannot run program "./bin/spark-submit" (in directory "/tmp/test-spark/spark-2.2.0"): error=2, No such file or directory
```

This failure has been reported inconsistently, in two ways. For example, the 
case above is reported as Case 2, `no failures`.

- Case 1. [Test Result (1 failure / +1)](https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7/4389/)
- Case 2. [Test Result (no failures)](https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.6/4811/)

This PR aims to make `HiveExternalCatalogVersionsSuite` more robust by 
verifying the downloaded `tgz` file: the suite extracts it and checks that 
`bin/spark-submit` exists. If the file turns out to be empty or corrupted, 
`HiveExternalCatalogVersionsSuite` retries, just as it does after a download 
failure.
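
The verify-then-retry idea can be illustrated with a minimal sketch. This is not the code in this PR (the suite itself is written in Scala); it is a Python illustration under stated assumptions. The names `is_valid_spark_archive` and `download_with_retry`, the `download` callable, and the `spark-<version>/` top-level directory layout are all hypothetical.

```python
import os
import tarfile
import tempfile
import zlib


def is_valid_spark_archive(tgz_path, version):
    """Return True if the archive extracts and contains bin/spark-submit.

    Assumes (hypothetically) that the archive unpacks into spark-<version>/.
    """
    # An empty or missing file can never be a valid archive.
    if not os.path.isfile(tgz_path) or os.path.getsize(tgz_path) == 0:
        return False
    try:
        with tempfile.TemporaryDirectory() as scratch:
            with tarfile.open(tgz_path, "r:gz") as archive:
                archive.extractall(scratch)
            submit = os.path.join(
                scratch, f"spark-{version}", "bin", "spark-submit")
            return os.path.isfile(submit)
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        # A corrupted or truncated download ends up here.
        return False


def download_with_retry(download, tgz_path, version, attempts=3):
    """Call download(tgz_path) up to `attempts` times until the file verifies.

    `download` is a hypothetical callable that writes the archive to
    tgz_path and raises OSError when the transfer itself fails.
    """
    for _ in range(attempts):
        try:
            download(tgz_path)
        except OSError:
            continue  # download failure: try again
        if is_valid_spark_archive(tgz_path, version):
            return True  # verified: stop retrying
        # Corrupted file: treated the same as a download failure.
    return False
```

The point of the sketch is that a corrupted file and a failed transfer take the same path: both consume one attempt and trigger another download, so a bad mirror response no longer surfaces later as a missing `./bin/spark-submit`.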

## How was this patch tested?

Pass the Jenkins tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-23489

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21210.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21210


commit 51d4c0ed72c15893a112c39d9e360e4cfabe6a62
Author: Dongjoon Hyun 
Date:   2018-05-02T04:48:21Z

[SPARK-23489][SQL][TEST] HiveExternalCatalogVersionsSuite should verify the downloaded file



