Github user liutang123 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19464#discussion_r144683771

--- Diff: core/src/test/scala/org/apache/spark/FileSuite.scala ---
@@ -510,4 +510,87 @@ class FileSuite extends SparkFunSuite with LocalSparkContext {
     }
   }
 
+  test("spark.files.ignoreEmptySplits work correctly (old Hadoop API)") {
+    val conf = new SparkConf()
+    conf.setAppName("test").setMaster("local").set(IGNORE_EMPTY_SPLITS, true)
+    sc = new SparkContext(conf)
+
+    def testIgnoreEmptySplits(
+        data: Array[Tuple2[String, String]],
+        actualPartitionNum: Int,
+        expectedPart: String,
+        expectedPartitionNum: Int): Unit = {
+      val output = new File(tempDir, "output")
+      sc.parallelize(data, actualPartitionNum)
+        .saveAsHadoopFile[TextOutputFormat[String, String]](output.getPath)
+      assert(new File(output, expectedPart).exists() === true)
+      val hadoopRDD = sc.textFile(new File(output, "part-*").getPath)
+      assert(hadoopRDD.partitions.length === expectedPartitionNum)
+      Utils.deleteRecursively(output)
--- End diff --

I think we don't need `try ... finally` here, because `Utils.deleteRecursively(output)` is only there to ensure that the next invocation of `testIgnoreEmptySplits` succeeds. When the test finishes, whether it passes or not, `tempDir` will be deleted in `FileSuite.afterEach()`.
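For reference, the `try ... finally` variant being declined would look roughly like this — a sketch, not code from the PR, assuming the same `FileSuite` scope and imports as the diff above (`sc`, `tempDir`, `Utils`, ScalaTest's `===`):

```scala
// Sketch of the try...finally alternative. It would also run cleanup when an
// assertion throws mid-helper, but that guarantee is redundant here because
// FileSuite.afterEach() deletes tempDir regardless of the test outcome.
def testIgnoreEmptySplits(
    data: Array[Tuple2[String, String]],
    actualPartitionNum: Int,
    expectedPart: String,
    expectedPartitionNum: Int): Unit = {
  val output = new File(tempDir, "output")
  try {
    sc.parallelize(data, actualPartitionNum)
      .saveAsHadoopFile[TextOutputFormat[String, String]](output.getPath)
    assert(new File(output, expectedPart).exists() === true)
    val hadoopRDD = sc.textFile(new File(output, "part-*").getPath)
    assert(hadoopRDD.partitions.length === expectedPartitionNum)
  } finally {
    // Deleting "output" only matters so the next call to this helper can
    // reuse the same directory; afterEach() cleans up tempDir anyway.
    Utils.deleteRecursively(output)
  }
}
```

The only behavioral difference is cleanup on assertion failure within a single helper call; since the suite-level teardown already covers that case, the simpler unconditional `Utils.deleteRecursively(output)` is enough.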