[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20479#discussion_r165579611 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -92,4 +96,37 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext { } } } + + allFileBasedDataSources.foreach { format => --- End diff -- Thank you so much, @HyukjinKwon ! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20479#discussion_r165578302 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -92,4 +96,39 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext { } } } + + // Only ORC/Parquet support this. --- End diff -- Ur, let me check back. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20479#discussion_r165577600 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -92,4 +96,39 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext { } } } + + // Only ORC/Parquet support this. --- End diff -- Wait .. don't other sources support this option? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20479#discussion_r165576607 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -92,4 +96,39 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext { } } } + + // Only ORC/Parquet support this. + Seq("orc", "parquet").foreach { format => --- End diff -- Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20479#discussion_r165575853 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -92,4 +96,39 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext { } } } + + // Only ORC/Parquet support this. + Seq("orc", "parquet").foreach { format => --- End diff -- Ah, yeah. This sounds better. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20479#discussion_r165559636 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala --- @@ -655,4 +655,35 @@ class OrcQuerySuite extends OrcQueryTest with SharedSQLContext { } } } + + testQuietly("Enabling/disabling ignoreMissingFiles") { --- End diff -- +1. Which suite is proper for that base test class? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20479#discussion_r165557435 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala --- @@ -655,4 +655,35 @@ class OrcQuerySuite extends OrcQueryTest with SharedSQLContext { } } } + + testQuietly("Enabling/disabling ignoreMissingFiles") { --- End diff -- Basically, this is copied from Parquet. To avoid duplicate codes, create a common base test class for parquet and orc? Then, we can deduplicate the codes? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20479#discussion_r165537214 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala --- @@ -655,4 +655,35 @@ class OrcQuerySuite extends OrcQueryTest with SharedSQLContext { } } } + + testQuietly("Enabling/disabling ignoreMissingFiles") { +def testIgnoreMissingFiles(): Unit = { + withTempDir { dir => +val basePath = dir.getCanonicalPath +spark.range(1).toDF("a").write.orc(new Path(basePath, "first").toString) +spark.range(1, 2).toDF("a").write.orc(new Path(basePath, "second").toString) +val thirdPath = new Path(basePath, "third") +spark.range(2, 3).toDF("a").write.orc(thirdPath.toString) +val df = spark.read.orc( + new Path(basePath, "first").toString, + new Path(basePath, "second").toString, + new Path(basePath, "third").toString) + +val fs = thirdPath.getFileSystem(spark.sparkContext.hadoopConfiguration) +fs.delete(thirdPath, true) --- End diff -- Sure. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20479#discussion_r165528709 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala --- @@ -655,4 +655,35 @@ class OrcQuerySuite extends OrcQueryTest with SharedSQLContext { } } } + + testQuietly("Enabling/disabling ignoreMissingFiles") { +def testIgnoreMissingFiles(): Unit = { + withTempDir { dir => +val basePath = dir.getCanonicalPath +spark.range(1).toDF("a").write.orc(new Path(basePath, "first").toString) +spark.range(1, 2).toDF("a").write.orc(new Path(basePath, "second").toString) +val thirdPath = new Path(basePath, "third") +spark.range(2, 3).toDF("a").write.orc(thirdPath.toString) +val df = spark.read.orc( + new Path(basePath, "first").toString, + new Path(basePath, "second").toString, + new Path(basePath, "third").toString) + +val fs = thirdPath.getFileSystem(spark.sparkContext.hadoopConfiguration) +fs.delete(thirdPath, true) --- End diff -- Shall we assert `true`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20479: [SPARK-23305][SQL][TEST] Add `spark.sql.files.ign...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/20479 [SPARK-23305][SQL][TEST] Add `spark.sql.files.ignoreMissingFiles` test case for ORC ## What changes were proposed in this pull request? Like Parquet, Apache Spark ORC already handles `spark.sql.files.ignoreMissingFiles` correctly. We had better have a test coverage for feature parity. ## How was this patch tested? Pass Jenkins with a newly added test case. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/spark SPARK-23305 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20479.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20479 commit 7fdfc35e5fc0f92b4bf2f687606bb48ad6fa68e6 Author: Dongjoon Hyun Date: 2018-02-01T18:23:00Z [SPARK-23305][SQL][TEST] Add `spark.sql.files.ignoreMissingFiles` test case for ORC --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org