Repository: spark Updated Branches: refs/heads/master 734ed7a7b -> b30a11a6a
[SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths

## What changes were proposed in this pull request?

`org.apache.spark.deploy.RPackageUtilsSuite`

```
- jars without manifest return false *** FAILED *** (109 milliseconds)
  java.io.IOException: Unable to delete file: C:\projects\spark\target\tmp\1500266936418-0\dep1-c.jar
```

`org.apache.spark.deploy.SparkSubmitSuite`

```
- download one file to local *** FAILED *** (16 milliseconds)
  java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2630198944759847458.jar

- download list of files to local *** FAILED *** (0 milliseconds)
  java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2783551769392880031.jar
```

`org.apache.spark.scheduler.ReplayListenerSuite`

```
- Replay compressed inprogress log file succeeding on partial read (156 milliseconds)
  Exception encountered when attempting to run a suite with class name: org.apache.spark.scheduler.ReplayListenerSuite *** ABORTED *** (1 second, 391 milliseconds)
  java.io.IOException: Failed to delete: C:\projects\spark\target\tmp\spark-8f3cacd6-faad-4121-b901-ba1bba8025a0

- End-to-end replay *** FAILED *** (62 milliseconds)
  java.io.IOException: No FileSystem for scheme: C

- End-to-end replay with compression *** FAILED *** (110 milliseconds)
  java.io.IOException: No FileSystem for scheme: C
```

`org.apache.spark.sql.hive.StatisticsSuite`

```
- SPARK-21079 - analyze table with location different than that of individual partitions *** FAILED *** (875 milliseconds)
  org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);

- SPARK-21079 - analyze partitioned table with only a subset of partitions visible *** FAILED *** (47 milliseconds)
  org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
```

**Note:** this PR does not fix:

`org.apache.spark.deploy.SparkSubmitSuite`

```
- launch simple application with spark-submit with redaction *** FAILED *** (172 milliseconds)
  java.util.NoSuchElementException: next on empty iterator
```

I can't reproduce this failure on my Windows machine, but it apparently fails consistently on AppVeyor. Its cause is still unclear to me and hard to debug, so I did not include it in this PR.

**Note:** it looks like there are more instances, but they are hard to identify, partly due to flakiness and partly due to the volume of logs and errors. I will probably make one more pass if this goes well.

## How was this patch tested?

Manually via AppVeyor:

**Before**

- `org.apache.spark.deploy.RPackageUtilsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/8t8ra3lrljuir7q4
- `org.apache.spark.deploy.SparkSubmitSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/taquy84yudjjen64
- `org.apache.spark.scheduler.ReplayListenerSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/24omrfn2k0xfa9xq
- `org.apache.spark.sql.hive.StatisticsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/2079y1plgj76dc9l

**After**

- `org.apache.spark.deploy.RPackageUtilsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/3803dbfn89ne1164
- `org.apache.spark.deploy.SparkSubmitSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/m5l350dp7u9a4xjr
- `org.apache.spark.scheduler.ReplayListenerSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/565vf74pp6bfdk18
- `org.apache.spark.sql.hive.StatisticsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/qm78tsk8c37jb6s4

Jenkins tests are required, and
AppVeyor tests will be triggered.

Author: hyukjinkwon <gurwls...@gmail.com>

Closes #18971 from HyukjinKwon/windows-fixes.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b30a11a6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b30a11a6
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b30a11a6

Branch: refs/heads/master
Commit: b30a11a6acf4b1512b5759f21ae58e69662ba455
Parents: 734ed7a
Author: hyukjinkwon <gurwls...@gmail.com>
Authored: Wed Aug 30 21:35:52 2017 +0900
Committer: hyukjinkwon <gurwls...@gmail.com>
Committed: Wed Aug 30 21:35:52 2017 +0900

----------------------------------------------------------------------
 .../spark/deploy/RPackageUtilsSuite.scala       |  7 +--
 .../apache/spark/deploy/SparkSubmitSuite.scala  |  4 +-
 .../spark/scheduler/ReplayListenerSuite.scala   | 53 +++++++++++---------
 .../apache/spark/sql/hive/StatisticsSuite.scala |  6 +--
 4 files changed, 39 insertions(+), 31 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala b/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
index 5e0bf6d..32dd3ec 100644
--- a/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
@@ -137,9 +137,10 @@ class RPackageUtilsSuite
     IvyTestUtils.withRepository(main, None, None) { repo =>
       val jar = IvyTestUtils.packJar(new File(new URI(repo)), dep1, Nil,
         useIvyLayout = false, withR = false, None)
-      val jarFile = new JarFile(jar)
-      assert(jarFile.getManifest == null, "jar file should have null manifest")
-      assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest should return false")
+      Utils.tryWithResource(new JarFile(jar)) { jarFile =>
+        assert(jarFile.getManifest == null, "jar file should have null manifest")
+        assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest should return false")
+      }
     }
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
index 724096d..7400ceb 100644
--- a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
@@ -831,7 +831,7 @@ class SparkSubmitSuite
     val hadoopConf = new Configuration()
     val tmpDir = Files.createTempDirectory("tmp").toFile
     updateConfWithFakeS3Fs(hadoopConf)
-    val sourcePath = s"s3a://${jarFile.getAbsolutePath}"
+    val sourcePath = s"s3a://${jarFile.toURI.getPath}"
     val outputPath = DependencyUtils.downloadFile(sourcePath, tmpDir, sparkConf, hadoopConf,
       new SecurityManager(sparkConf))
     checkDownloadedFile(sourcePath, outputPath)
@@ -847,7 +847,7 @@ class SparkSubmitSuite
     val hadoopConf = new Configuration()
     val tmpDir = Files.createTempDirectory("tmp").toFile
     updateConfWithFakeS3Fs(hadoopConf)
-    val sourcePaths = Seq("/local/file", s"s3a://${jarFile.getAbsolutePath}")
+    val sourcePaths = Seq("/local/file", s"s3a://${jarFile.toURI.getPath}")
     val outputPaths = DependencyUtils
       .downloadFileList(sourcePaths.mkString(","), tmpDir, sparkConf, hadoopConf,
         new SecurityManager(sparkConf))

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala b/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
index 88a68af..d17e386 100644
--- a/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
@@ -21,6 +21,7 @@ import java.io._
 import java.net.URI
 import java.util.concurrent.atomic.AtomicInteger
 
+import org.apache.hadoop.fs.Path
 import org.json4s.jackson.JsonMethods._
 import org.scalatest.BeforeAndAfter
 
@@ -84,24 +85,23 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
     val buffered = new ByteArrayOutputStream
     val codec = new LZ4CompressionCodec(new SparkConf())
     val compstream = codec.compressedOutputStream(buffered)
-    val writer = new PrintWriter(compstream)
+    Utils.tryWithResource(new PrintWriter(compstream)) { writer =>
 
-    val applicationStart = SparkListenerApplicationStart("AppStarts", None,
-      125L, "Mickey", None)
-    val applicationEnd = SparkListenerApplicationEnd(1000L)
+      val applicationStart = SparkListenerApplicationStart("AppStarts", None,
+        125L, "Mickey", None)
+      val applicationEnd = SparkListenerApplicationEnd(1000L)
 
-    // scalastyle:off println
-    writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationStart))))
-    writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationEnd))))
-    // scalastyle:on println
-    writer.close()
+      // scalastyle:off println
+      writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationStart))))
+      writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationEnd))))
+      // scalastyle:on println
+    }
 
     val logFilePath = Utils.getFilePath(testDir, "events.lz4.inprogress")
-    val fstream = fileSystem.create(logFilePath)
     val bytes = buffered.toByteArray
-
-    fstream.write(bytes, 0, buffered.size)
-    fstream.close
+    Utils.tryWithResource(fileSystem.create(logFilePath)) { fstream =>
+      fstream.write(bytes, 0, buffered.size)
+    }
 
     // Read the compressed .inprogress file and verify only first event was parsed.
     val conf = EventLoggingListenerSuite.getLoggingConf(logFilePath)
@@ -112,17 +112,19 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
 
     // Verify the replay returns the events given the input maybe truncated.
     val logData = EventLoggingListener.openEventLog(logFilePath, fileSystem)
-    val failingStream = new EarlyEOFInputStream(logData, buffered.size - 10)
-    replayer.replay(failingStream, logFilePath.toString, true)
+    Utils.tryWithResource(new EarlyEOFInputStream(logData, buffered.size - 10)) { failingStream =>
+      replayer.replay(failingStream, logFilePath.toString, true)
 
-    assert(eventMonster.loggedEvents.size === 1)
-    assert(failingStream.didFail)
+      assert(eventMonster.loggedEvents.size === 1)
+      assert(failingStream.didFail)
+    }
 
     // Verify the replay throws the EOF exception since the input may not be truncated.
     val logData2 = EventLoggingListener.openEventLog(logFilePath, fileSystem)
-    val failingStream2 = new EarlyEOFInputStream(logData2, buffered.size - 10)
-    intercept[EOFException] {
-      replayer.replay(failingStream2, logFilePath.toString, false)
+    Utils.tryWithResource(new EarlyEOFInputStream(logData2, buffered.size - 10)) { failingStream2 =>
+      intercept[EOFException] {
+        replayer.replay(failingStream2, logFilePath.toString, false)
+      }
     }
   }
 
@@ -151,7 +153,10 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
    * assumption that the event logging behavior is correct (tested in a separate suite).
    */
   private def testApplicationReplay(codecName: Option[String] = None) {
-    val logDirPath = Utils.getFilePath(testDir, "test-replay")
+    val logDir = new File(testDir.getAbsolutePath, "test-replay")
+    // Here, it creates `Path` from the URI instead of the absolute path for the explicit file
+    // scheme so that the string representation of this `Path` has leading file scheme correctly.
+    val logDirPath = new Path(logDir.toURI)
     fileSystem.mkdirs(logDirPath)
 
     val conf = EventLoggingListenerSuite.getLoggingConf(logDirPath, codecName)
@@ -221,12 +226,14 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
     def didFail: Boolean = countDown.get == 0
 
     @throws[IOException]
-    def read: Int = {
+    override def read(): Int = {
       if (countDown.get == 0) {
         throw new EOFException("Stream ended prematurely")
       }
       countDown.decrementAndGet()
-      in.read
+      in.read()
     }
+
+    override def close(): Unit = in.close()
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
----------------------------------------------------------------------
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
index dc61407..03e50e4 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
@@ -203,7 +203,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         sql(s"INSERT INTO TABLE $tableName PARTITION (ds='$ds') SELECT * FROM src")
       }
 
-      sql(s"ALTER TABLE $tableName SET LOCATION '$path'")
+      sql(s"ALTER TABLE $tableName SET LOCATION '${path.toURI}'")
 
       sql(s"ANALYZE TABLE $tableName COMPUTE STATISTICS noscan")
 
@@ -222,7 +222,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         s"""
            |CREATE TABLE $sourceTableName (key STRING, value STRING)
            |PARTITIONED BY (ds STRING)
-           |LOCATION '$path'
+           |LOCATION '${path.toURI}'
          """.stripMargin)
 
       val partitionDates = List("2010-01-01", "2010-01-02", "2010-01-03")
@@ -239,7 +239,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         s"""
            |CREATE TABLE $tableName (key STRING, value STRING)
            |PARTITIONED BY (ds STRING)
-           |LOCATION '$path'
+           |LOCATION '${path.toURI}'
          """.stripMargin)
 
       // Register only one of the partitions found on disk

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
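The `RPackageUtilsSuite` fix wraps the `JarFile` in Spark's `Utils.tryWithResource`, which plays the same role as Java's try-with-resources: on Windows, an open file handle prevents the temp directory from being deleted. A minimal standalone sketch of the same pattern in plain Java (the jar contents here are made up for illustration):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarCloseExample {
    public static void main(String[] args) throws Exception {
        // Build a jar with no manifest, analogous to the suite's test fixture.
        File jar = File.createTempFile("dep1-c", ".jar");
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry("dummy.txt"));
        }
        // try-with-resources guarantees the JarFile handle is released,
        // so Windows can delete the file afterwards.
        try (JarFile jarFile = new JarFile(jar)) {
            System.out.println(jarFile.getManifest() == null); // no manifest present
        }
        System.out.println(jar.delete()); // succeeds because no handle is held
    }
}
```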
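The `SparkSubmitSuite` failures come from interpolating `getAbsolutePath` into an `s3a://` URL: on Windows that yields `C:\...` with a drive-letter colon and backslashes, which is not valid URI syntax, while `File#toURI.getPath` yields the URI-style `/C:/...`. A small sketch of the difference, using hard-coded Windows-style strings since the behavior cannot be reproduced from a POSIX machine:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class S3aPathExample {
    public static void main(String[] args) {
        // What File#getAbsolutePath would return on Windows:
        String absolutePath = "C:\\projects\\spark\\target\\tmp\\test.jar";
        // What File#toURI().getPath() would return for the same file:
        String uriPath = "/C:/projects/spark/target/tmp/test.jar";

        try {
            new URI("s3a://" + absolutePath);
            System.out.println("parsed");
        } catch (URISyntaxException e) {
            // The drive letter and backslashes break URI parsing, matching
            // the "Illegal character in authority" failures in the log.
            System.out.println("URISyntaxException");
        }

        try {
            // The URI-style path parses cleanly: empty authority, path /C:/...
            URI ok = new URI("s3a://" + uriPath);
            System.out.println(ok.getPath());
        } catch (URISyntaxException e) {
            System.out.println("URISyntaxException");
        }
    }
}
```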
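The `EarlyEOFInputStream` change in `ReplayListenerSuite` fixes two things: `def read: Int` without `override` did not actually override `InputStream.read()` (the parentheses matter in Scala), and the wrapper never closed the underlying stream, leaking a file handle. A simplified Java analogue of the corrected wrapper, with hypothetical test data:

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Wraps a stream and fails after a fixed number of reads, like the
// suite's EarlyEOFInputStream; read() and close() are real overrides.
class EarlyEOFStream extends InputStream {
    private final InputStream in;
    private int countDown;

    EarlyEOFStream(InputStream in, int maxRead) {
        this.in = in;
        this.countDown = maxRead;
    }

    @Override
    public int read() throws IOException {
        if (countDown == 0) {
            throw new EOFException("Stream ended prematurely");
        }
        countDown--;
        return in.read();
    }

    @Override
    public void close() throws IOException {
        in.close(); // release the underlying handle, important on Windows
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[16];
        try (InputStream s = new EarlyEOFStream(new ByteArrayInputStream(data), 4)) {
            for (int i = 0; i < 4; i++) {
                s.read();
            }
            s.read(); // the fifth read fails early
        } catch (EOFException e) {
            System.out.println(e.getMessage());
        }
    }
}
```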
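The "No FileSystem for scheme: C" errors arise because a Windows absolute path starts with `C:\`, so Hadoop's `Path` parses the drive letter as a URI scheme; building the `Path` from `File#toURI` gives it an explicit `file:` scheme instead. A sketch of the underlying JDK behavior, without the Hadoop dependency and with a hypothetical directory name:

```java
import java.io.File;
import java.net.URI;

public class FileSchemeExample {
    public static void main(String[] args) {
        File logDir = new File("/tmp/spark-tests", "test-replay"); // hypothetical dir
        URI uri = logDir.toURI();
        // The URI form carries an explicit scheme, so a consumer that splits
        // scheme from path (as Hadoop's Path does) resolves "file" rather
        // than misreading a Windows drive letter such as "C".
        System.out.println(uri.getScheme());
    }
}
```

The same reasoning explains the `StatisticsSuite` fix, which interpolates `${path.toURI}` rather than the raw path into the `LOCATION` clauses.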