GitHub user MaxGekk opened a pull request: https://github.com/apache/spark/pull/23052

    [SPARK-26081][SQL] Prevent empty files for empty partitions in Text datasources

    ## What changes were proposed in this pull request?

    In this PR, I propose to postpone the creation of `OutputStream`/`Univocity`/`JacksonGenerator` until the first row is written. This prevents the creation of empty files for empty partitions. As a result, there is no need to open and read such files back when loading data from the location.

    ## How was this patch tested?

    Added tests for the Text, JSON and CSV datasources in which an empty dataset is written and must not produce any files.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 text-empty-files

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23052.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23052

----
commit 3efa7b615f7c37538edb0afca30d4f300ac07aee
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-15T19:44:47Z

    Added a test for text datasource

commit 80aadf645ab63885ce6f43ac74b0c02871e10883
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-15T20:11:00Z

    Creating output stream on the first write

commit 0a774ef9e4de987c9f3073b90396215b9f04ca16
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-15T20:20:27Z

    Test for csv

commit 47b71b7a235ffcdfa79753307f1afcb377a17977
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-15T20:21:06Z

    Don't produce empty CSV files

commit 040c71f8ea49ca10160cfa242095d6ebd2d76a8d
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-15T20:22:23Z

    Test for JSON

commit 6f3cb18d5a863f6aded763bdeb5395f6622876ff
Author: Maxim Gekk <max.gekk@...>
Date:   2018-11-15T20:32:32Z

    Do not produce empty JSON files

----
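
For readers who want to see the idea from the PR description in isolation, here is a minimal sketch of deferring file creation until the first row is written. It is a language-agnostic illustration in Python, not the code in this patch: `LazyLineWriter`, `write_partition`, `demo`, and the `part-0000N.txt` file names are all hypothetical and do not correspond to Spark classes.

```python
import os
import tempfile
from typing import IO, Iterable, Optional


class LazyLineWriter:
    """Hypothetical writer that creates its output file only on the first write."""

    def __init__(self, path: str) -> None:
        self.path = path
        # The stream is not opened here, so constructing a writer creates no file.
        self._stream: Optional[IO[str]] = None

    def write(self, row: str) -> None:
        if self._stream is None:
            # First row for this partition: create the file now.
            self._stream = open(self.path, "w", encoding="utf-8")
        self._stream.write(row + "\n")

    def close(self) -> None:
        # If no row was ever written there is nothing to close and no file on disk.
        if self._stream is not None:
            self._stream.close()
            self._stream = None


def write_partition(rows: Iterable[str], path: str) -> bool:
    """Write one partition and report whether a file was created for it."""
    writer = LazyLineWriter(path)
    try:
        for row in rows:
            writer.write(row)
    finally:
        writer.close()
    return os.path.exists(path)


def demo() -> tuple:
    """Write one empty and one non-empty partition into a temporary directory."""
    with tempfile.TemporaryDirectory() as out_dir:
        empty = write_partition([], os.path.join(out_dir, "part-00000.txt"))
        full = write_partition(["a", "b"], os.path.join(out_dir, "part-00001.txt"))
        return empty, full, sorted(os.listdir(out_dir))
```

In this sketch the empty partition leaves no file behind, so a later read of the output directory only has to open files that actually contain rows. The cost is a check on every `write` call and a `close` that must tolerate a stream that was never opened.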