Repository: spark Updated Branches: refs/heads/master 734ed7a7b -> b30a11a6a
[SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths

## What changes were proposed in this pull request?

`org.apache.spark.deploy.RPackageUtilsSuite`

```
- jars without manifest return false *** FAILED *** (109 milliseconds)
  java.io.IOException: Unable to delete file: C:\projects\spark\target\tmp\1500266936418-0\dep1-c.jar
```

`org.apache.spark.deploy.SparkSubmitSuite`

```
- download one file to local *** FAILED *** (16 milliseconds)
  java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2630198944759847458.jar

- download list of files to local *** FAILED *** (0 milliseconds)
  java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2783551769392880031.jar
```

`org.apache.spark.scheduler.ReplayListenerSuite`

```
- Replay compressed inprogress log file succeeding on partial read (156 milliseconds)
  Exception encountered when attempting to run a suite with class name: org.apache.spark.scheduler.ReplayListenerSuite *** ABORTED *** (1 second, 391 milliseconds)
  java.io.IOException: Failed to delete: C:\projects\spark\target\tmp\spark-8f3cacd6-faad-4121-b901-ba1bba8025a0

- End-to-end replay *** FAILED *** (62 milliseconds)
  java.io.IOException: No FileSystem for scheme: C

- End-to-end replay with compression *** FAILED *** (110 milliseconds)
  java.io.IOException: No FileSystem for scheme: C
```

`org.apache.spark.sql.hive.StatisticsSuite`

```
- SPARK-21079 - analyze table with location different than that of individual partitions *** FAILED *** (875 milliseconds)
  org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);

- SPARK-21079 - analyze partitioned table with only a subset of partitions visible *** FAILED *** (47 milliseconds)
  org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
```

**Note:** this PR does not fix:

`org.apache.spark.deploy.SparkSubmitSuite`

```
- launch simple application with spark-submit with redaction *** FAILED *** (172 milliseconds)
  java.util.NoSuchElementException: next on empty iterator
```

I can't reproduce this failure on my Windows machine, but it apparently fails consistently on AppVeyor. Its cause is still unclear to me and hard to debug, so I did not include it in this PR.

**Note:** it looks like there are more instances, but they are hard to identify, partly due to flakiness and partly due to the volume of logs and errors. I will probably make one more pass if this goes well.

## How was this patch tested?

Manually via AppVeyor:

**Before**

- `org.apache.spark.deploy.RPackageUtilsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/8t8ra3lrljuir7q4
- `org.apache.spark.deploy.SparkSubmitSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/taquy84yudjjen64
- `org.apache.spark.scheduler.ReplayListenerSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/24omrfn2k0xfa9xq
- `org.apache.spark.sql.hive.StatisticsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/2079y1plgj76dc9l

**After**

- `org.apache.spark.deploy.RPackageUtilsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/3803dbfn89ne1164
- `org.apache.spark.deploy.SparkSubmitSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/m5l350dp7u9a4xjr
- `org.apache.spark.scheduler.ReplayListenerSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/565vf74pp6bfdk18
- `org.apache.spark.sql.hive.StatisticsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/qm78tsk8c37jb6s4

Jenkins tests are required, and
AppVeyor tests will be triggered.

Author: hyukjinkwon <gurwls...@gmail.com>

Closes #18971 from HyukjinKwon/windows-fixes.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b30a11a6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b30a11a6
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b30a11a6

Branch: refs/heads/master
Commit: b30a11a6acf4b1512b5759f21ae58e69662ba455
Parents: 734ed7a
Author: hyukjinkwon <gurwls...@gmail.com>
Authored: Wed Aug 30 21:35:52 2017 +0900
Committer: hyukjinkwon <gurwls...@gmail.com>
Committed: Wed Aug 30 21:35:52 2017 +0900

----------------------------------------------------------------------
 .../spark/deploy/RPackageUtilsSuite.scala       |  7 +--
 .../apache/spark/deploy/SparkSubmitSuite.scala  |  4 +-
 .../spark/scheduler/ReplayListenerSuite.scala   | 53 +++++++++++---------
 .../apache/spark/sql/hive/StatisticsSuite.scala |  6 +--
 4 files changed, 39 insertions(+), 31 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala b/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
index 5e0bf6d..32dd3ec 100644
--- a/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
@@ -137,9 +137,10 @@ class RPackageUtilsSuite
     IvyTestUtils.withRepository(main, None, None) { repo =>
       val jar = IvyTestUtils.packJar(new File(new URI(repo)), dep1, Nil,
         useIvyLayout = false, withR = false, None)
-      val jarFile = new JarFile(jar)
-      assert(jarFile.getManifest == null, "jar file should have null manifest")
-      assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest should return false")
+      Utils.tryWithResource(new JarFile(jar)) { jarFile =>
+        assert(jarFile.getManifest == null, "jar file should have null manifest")
+        assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest should return false")
+      }
     }
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
index 724096d..7400ceb 100644
--- a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
@@ -831,7 +831,7 @@ class SparkSubmitSuite
     val hadoopConf = new Configuration()
     val tmpDir = Files.createTempDirectory("tmp").toFile
     updateConfWithFakeS3Fs(hadoopConf)
-    val sourcePath = s"s3a://${jarFile.getAbsolutePath}"
+    val sourcePath = s"s3a://${jarFile.toURI.getPath}"
     val outputPath = DependencyUtils.downloadFile(sourcePath, tmpDir, sparkConf, hadoopConf,
       new SecurityManager(sparkConf))
     checkDownloadedFile(sourcePath, outputPath)
@@ -847,7 +847,7 @@ class SparkSubmitSuite
     val hadoopConf = new Configuration()
     val tmpDir = Files.createTempDirectory("tmp").toFile
     updateConfWithFakeS3Fs(hadoopConf)
-    val sourcePaths = Seq("/local/file", s"s3a://${jarFile.getAbsolutePath}")
+    val sourcePaths = Seq("/local/file", s"s3a://${jarFile.toURI.getPath}")
     val outputPaths = DependencyUtils
       .downloadFileList(sourcePaths.mkString(","), tmpDir, sparkConf, hadoopConf,
         new SecurityManager(sparkConf))

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala b/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
index 88a68af..d17e386 100644
--- a/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
@@ -21,6 +21,7 @@ import java.io._
 import java.net.URI
 import java.util.concurrent.atomic.AtomicInteger
 
+import org.apache.hadoop.fs.Path
 import org.json4s.jackson.JsonMethods._
 import org.scalatest.BeforeAndAfter
 
@@ -84,24 +85,23 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
     val buffered = new ByteArrayOutputStream
     val codec = new LZ4CompressionCodec(new SparkConf())
     val compstream = codec.compressedOutputStream(buffered)
-    val writer = new PrintWriter(compstream)
+    Utils.tryWithResource(new PrintWriter(compstream)) { writer =>
 
-    val applicationStart = SparkListenerApplicationStart("AppStarts", None,
-      125L, "Mickey", None)
-    val applicationEnd = SparkListenerApplicationEnd(1000L)
+      val applicationStart = SparkListenerApplicationStart("AppStarts", None,
+        125L, "Mickey", None)
+      val applicationEnd = SparkListenerApplicationEnd(1000L)
 
-    // scalastyle:off println
-    writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationStart))))
-    writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationEnd))))
-    // scalastyle:on println
-    writer.close()
+      // scalastyle:off println
+      writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationStart))))
+      writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationEnd))))
+      // scalastyle:on println
+    }
 
     val logFilePath = Utils.getFilePath(testDir, "events.lz4.inprogress")
-    val fstream = fileSystem.create(logFilePath)
     val bytes = buffered.toByteArray
-
-    fstream.write(bytes, 0, buffered.size)
-    fstream.close
+    Utils.tryWithResource(fileSystem.create(logFilePath)) { fstream =>
+      fstream.write(bytes, 0, buffered.size)
+    }
 
     // Read the compressed .inprogress file and verify only first event was parsed.
     val conf = EventLoggingListenerSuite.getLoggingConf(logFilePath)
@@ -112,17 +112,19 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
 
     // Verify the replay returns the events given the input maybe truncated.
     val logData = EventLoggingListener.openEventLog(logFilePath, fileSystem)
-    val failingStream = new EarlyEOFInputStream(logData, buffered.size - 10)
-    replayer.replay(failingStream, logFilePath.toString, true)
+    Utils.tryWithResource(new EarlyEOFInputStream(logData, buffered.size - 10)) { failingStream =>
+      replayer.replay(failingStream, logFilePath.toString, true)
 
-    assert(eventMonster.loggedEvents.size === 1)
-    assert(failingStream.didFail)
+      assert(eventMonster.loggedEvents.size === 1)
+      assert(failingStream.didFail)
+    }
 
     // Verify the replay throws the EOF exception since the input may not be truncated.
     val logData2 = EventLoggingListener.openEventLog(logFilePath, fileSystem)
-    val failingStream2 = new EarlyEOFInputStream(logData2, buffered.size - 10)
-    intercept[EOFException] {
-      replayer.replay(failingStream2, logFilePath.toString, false)
+    Utils.tryWithResource(new EarlyEOFInputStream(logData2, buffered.size - 10)) { failingStream2 =>
+      intercept[EOFException] {
+        replayer.replay(failingStream2, logFilePath.toString, false)
+      }
     }
   }
 
@@ -151,7 +153,10 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
    * assumption that the event logging behavior is correct (tested in a separate suite).
    */
   private def testApplicationReplay(codecName: Option[String] = None) {
-    val logDirPath = Utils.getFilePath(testDir, "test-replay")
+    val logDir = new File(testDir.getAbsolutePath, "test-replay")
+    // Here, it creates `Path` from the URI instead of the absolute path for the explicit file
+    // scheme so that the string representation of this `Path` has leading file scheme correctly.
+    val logDirPath = new Path(logDir.toURI)
     fileSystem.mkdirs(logDirPath)
 
     val conf = EventLoggingListenerSuite.getLoggingConf(logDirPath, codecName)
@@ -221,12 +226,14 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
     def didFail: Boolean = countDown.get == 0
 
     @throws[IOException]
-    def read: Int = {
+    override def read(): Int = {
       if (countDown.get == 0) {
         throw new EOFException("Stream ended prematurely")
       }
       countDown.decrementAndGet()
-      in.read
+      in.read()
     }
+
+    override def close(): Unit = in.close()
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
----------------------------------------------------------------------
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
index dc61407..03e50e4 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
@@ -203,7 +203,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         sql(s"INSERT INTO TABLE $tableName PARTITION (ds='$ds') SELECT * FROM src")
       }
 
-      sql(s"ALTER TABLE $tableName SET LOCATION '$path'")
+      sql(s"ALTER TABLE $tableName SET LOCATION '${path.toURI}'")
 
       sql(s"ANALYZE TABLE $tableName COMPUTE STATISTICS noscan")
 
@@ -222,7 +222,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         s"""
            |CREATE TABLE $sourceTableName (key STRING, value STRING)
            |PARTITIONED BY (ds STRING)
-           |LOCATION '$path'
+           |LOCATION '${path.toURI}'
          """.stripMargin)
 
       val partitionDates = List("2010-01-01", "2010-01-02", "2010-01-03")
@@ -239,7 +239,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         s"""
            |CREATE TABLE $tableName (key STRING, value STRING)
            |PARTITIONED BY (ds STRING)
-           |LOCATION '$path'
+           |LOCATION '${path.toURI}'
          """.stripMargin)
 
       // Register only one of the partitions found on disk

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
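The `RPackageUtilsSuite` fix wraps the `JarFile` in Spark's `Utils.tryWithResource`, which plays the same role as Java's try-with-resources: on Windows, an open file handle prevents the temp directory from being deleted. A minimal standalone sketch of the same pattern in plain Java (the jar contents here are made up for illustration):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarCloseExample {
    public static void main(String[] args) throws Exception {
        // Build a jar with no manifest, analogous to the suite's test fixture.
        File jar = File.createTempFile("dep1-c", ".jar");
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry("dummy.txt"));
        }
        // try-with-resources guarantees the JarFile handle is released,
        // so Windows can delete the file afterwards.
        try (JarFile jarFile = new JarFile(jar)) {
            System.out.println(jarFile.getManifest() == null); // no manifest present
        }
        System.out.println(jar.delete()); // succeeds because no handle is held
    }
}
```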
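The `SparkSubmitSuite` failures come from interpolating `getAbsolutePath` into an `s3a://` URL: on Windows that yields `C:\...` with a drive-letter colon and backslashes, which is not valid URI syntax, while `File#toURI.getPath` yields the URI-style `/C:/...`. A small sketch of the difference, using hard-coded Windows-style strings since the behavior cannot be reproduced from a POSIX machine:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class S3aPathExample {
    public static void main(String[] args) {
        // What File#getAbsolutePath would return on Windows:
        String absolutePath = "C:\\projects\\spark\\target\\tmp\\test.jar";
        // What File#toURI().getPath() would return for the same file:
        String uriPath = "/C:/projects/spark/target/tmp/test.jar";

        try {
            new URI("s3a://" + absolutePath);
            System.out.println("parsed");
        } catch (URISyntaxException e) {
            // The drive letter and backslashes break URI parsing, matching
            // the "Illegal character in authority" failures in the log.
            System.out.println("URISyntaxException");
        }

        try {
            // The URI-style path parses cleanly: empty authority, path /C:/...
            URI ok = new URI("s3a://" + uriPath);
            System.out.println(ok.getPath());
        } catch (URISyntaxException e) {
            System.out.println("URISyntaxException");
        }
    }
}
```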
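The `EarlyEOFInputStream` change in `ReplayListenerSuite` fixes two things: `def read: Int` without `override` did not actually override `InputStream.read()` (the parentheses matter in Scala), and the wrapper never closed the underlying stream, leaking a file handle. A simplified Java analogue of the corrected wrapper, with hypothetical test data:

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Wraps a stream and fails after a fixed number of reads, like the
// suite's EarlyEOFInputStream; read() and close() are real overrides.
class EarlyEOFStream extends InputStream {
    private final InputStream in;
    private int countDown;

    EarlyEOFStream(InputStream in, int maxRead) {
        this.in = in;
        this.countDown = maxRead;
    }

    @Override
    public int read() throws IOException {
        if (countDown == 0) {
            throw new EOFException("Stream ended prematurely");
        }
        countDown--;
        return in.read();
    }

    @Override
    public void close() throws IOException {
        in.close(); // release the underlying handle, important on Windows
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[16];
        try (InputStream s = new EarlyEOFStream(new ByteArrayInputStream(data), 4)) {
            for (int i = 0; i < 4; i++) {
                s.read();
            }
            s.read(); // the fifth read fails early
        } catch (EOFException e) {
            System.out.println(e.getMessage());
        }
    }
}
```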
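The "No FileSystem for scheme: C" errors arise because a Windows absolute path starts with `C:\`, so Hadoop's `Path` parses the drive letter as a URI scheme; building the `Path` from `File#toURI` gives it an explicit `file:` scheme instead. A sketch of the underlying JDK behavior, without the Hadoop dependency and with a hypothetical directory name:

```java
import java.io.File;
import java.net.URI;

public class FileSchemeExample {
    public static void main(String[] args) {
        File logDir = new File("/tmp/spark-tests", "test-replay"); // hypothetical dir
        URI uri = logDir.toURI();
        // The URI form carries an explicit scheme, so a consumer that splits
        // scheme from path (as Hadoop's Path does) resolves "file" rather
        // than misreading a Windows drive letter such as "C".
        System.out.println(uri.getScheme());
    }
}
```

The same reasoning explains the `StatisticsSuite` fix, which interpolates `${path.toURI}` rather than the raw path into the `LOCATION` clauses.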