brunal commented on code in PR #16342:
URL: https://github.com/apache/datafusion/pull/16342#discussion_r2185262228

##########
datafusion/datasource/src/file_sink_config.rs:
##########
@@ -77,13 +79,34 @@ pub trait FileSink: DataSink {
             .runtime_env()
             .object_store(&config.object_store_url)?;
         let (demux_task, file_stream_rx) = start_demuxer_task(config, data, context);
-        self.spawn_writer_tasks_and_join(
-            context,
-            demux_task,
-            file_stream_rx,
-            object_store,
-        )
-        .await
+        let mut num_rows = self
+            .spawn_writer_tasks_and_join(
+                context,
+                demux_task,
+                file_stream_rx,
+                Arc::clone(&object_store),
+            )
+            .await?;
+        if num_rows == 0 {
+            // If no rows were written, then no files are output either.

Review Comment:
   You say no rows => no file was created. But then you say write an empty RecordBatch => ensure a file gets created. However, an empty RecordBatch has no rows (at least when written to a Parquet file), so these two statements contradict each other.

   In practice, this PR caused a regression: we can no longer write an empty RecordBatch to Parquet, because the code here tries to write it a second time and we get an error.
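
   To make the distinction concrete, here is a minimal sketch of why "zero rows written" and "zero files written" are not the same condition. It is a toy model, not DataFusion code: `FakeStore`, `WriteSummary`, `write_batches`, `sink_checking_rows` and `sink_checking_files` are all hypothetical names. It assumes that a single empty batch still produces a file and that creating the same path twice is an error.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an object store: path -> row count of the file.
#[derive(Default)]
struct FakeStore {
    files: BTreeMap<String, usize>,
}

impl FakeStore {
    /// Assumption: creating a path that already exists is an error,
    /// which models the double-write failure described above.
    fn create(&mut self, path: &str, rows: usize) -> Result<(), String> {
        if self.files.contains_key(path) {
            return Err(format!("file already exists: {path}"));
        }
        self.files.insert(path.to_string(), rows);
        Ok(())
    }
}

/// Hypothetical summary reported by the writer tasks.
struct WriteSummary {
    rows_written: usize,
    files_written: usize,
}

/// Each batch is represented only by its row count.
/// Assumption: any batch, even an empty one, causes the file to be created.
fn write_batches(
    store: &mut FakeStore,
    path: &str,
    batches: &[usize],
) -> Result<WriteSummary, String> {
    if batches.is_empty() {
        return Ok(WriteSummary { rows_written: 0, files_written: 0 });
    }
    let rows: usize = batches.iter().sum();
    store.create(path, rows)?;
    Ok(WriteSummary { rows_written: rows, files_written: 1 })
}

/// Fallback keyed on the row count: an empty batch makes it write twice.
fn sink_checking_rows(
    store: &mut FakeStore,
    path: &str,
    batches: &[usize],
) -> Result<usize, String> {
    let summary = write_batches(store, path, batches)?;
    if summary.rows_written == 0 {
        store.create(path, 0)?; // second create fails if a file already exists
    }
    Ok(summary.rows_written)
}

/// Fallback keyed on the file count: only writes when nothing was created.
fn sink_checking_files(
    store: &mut FakeStore,
    path: &str,
    batches: &[usize],
) -> Result<usize, String> {
    let summary = write_batches(store, path, batches)?;
    if summary.files_written == 0 {
        store.create(path, 0)?;
    }
    Ok(summary.rows_written)
}
```

   With no input batches at all, both variants create one empty file. With a single empty batch (`&[0]`), `sink_checking_rows` returns an error because it tries to create the file a second time, while `sink_checking_files` succeeds. Whether the real fix should track files, or should skip the fallback when an empty batch was seen, is a separate question for the PR.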