Maxim Gekk created SPARK-26081:
----------------------------------

             Summary: Do not write empty files by text datasources
                 Key: SPARK-26081
                 URL: https://issues.apache.org/jira/browse/SPARK-26081
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: Maxim Gekk

Text-based datasources such as CSV, JSON, and Text produce empty files for empty partitions. This adds overhead when such files are opened and read back. In the current implementation of OutputWriter, the output stream is created eagerly, even if no records are ever written to it. Creation of the stream can therefore be postponed until the first write.
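
The idea of postponing stream creation until the first write can be illustrated with a minimal sketch. This is not Spark's OutputWriter code: the class LazyTextWriter, the helper write_partition, and the part-file names are hypothetical and are written in Python only to show the pattern, assuming a writer that receives records one at a time and is closed once per partition.

```python
import os
import tempfile


class LazyTextWriter:
    """Hypothetical writer that opens its file only on the first write."""

    def __init__(self, path):
        self.path = path
        self._stream = None  # nothing is created on disk yet

    def write(self, record):
        if self._stream is None:
            # First record for this partition: create the file now.
            self._stream = open(self.path, "w", encoding="utf-8")
        self._stream.write(record + "\n")

    def close(self):
        # If no record was written, no stream exists and no file was created.
        if self._stream is not None:
            self._stream.close()
            self._stream = None


def write_partition(path, records):
    """Write one partition and report whether a file was produced."""
    writer = LazyTextWriter(path)
    try:
        for record in records:
            writer.write(record)
    finally:
        writer.close()
    return os.path.exists(path)


out_dir = tempfile.mkdtemp()
# Empty partition: the writer is created and closed, but never written to.
empty_created = write_partition(os.path.join(out_dir, "part-00000.txt"), [])
# Non-empty partition: the file appears on the first write.
full_created = write_partition(os.path.join(out_dir, "part-00001.txt"), ["a", "b"])
```

With an eager writer, both partitions would leave a file behind. In this sketch only the non-empty partition does, so a later read has one fewer file to open.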