wecharyu commented on code in PR #48468:
URL: https://github.com/apache/arrow/pull/48468#discussion_r2747118575
##########
cpp/src/parquet/file_writer.cc:
##########
@@ -68,6 +68,12 @@ int64_t RowGroupWriter::total_compressed_bytes_written() const {
return contents_->total_compressed_bytes_written();
}
+int64_t RowGroupWriter::EstimatedTotalCompressedBytes() const {
+ return contents_->total_compressed_bytes() +
+ contents_->total_compressed_bytes_written() +
+ contents_->EstimatedBufferedValueBytes();
Review Comment:
I prefer 3 too; this is also the approach taken by parquet-java. It also
keeps the final row group size smaller than `max_row_group_bytes`, which is
intuitive.
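
To illustrate the size point, here is a minimal sketch. It assumes the writer compares its estimate against the limit before appending a batch and closes the current row group if the batch would push it over. `ToyRowGroupWriter` and `WriteBatches` are hypothetical names for illustration only, not the Arrow or parquet-java API, and the byte counts stand in for whatever the real estimate returns.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a row group writer; not the Arrow API.
struct ToyRowGroupWriter {
  int64_t written_bytes = 0;   // bytes already flushed to the sink
  int64_t buffered_bytes = 0;  // estimated bytes still held in memory

  // Analogous in spirit to an "estimated total compressed bytes" accessor.
  int64_t EstimatedTotalBytes() const { return written_bytes + buffered_bytes; }
};

// Appends batches (given by their estimated sizes) and returns the size of
// each row group that was closed. A row group is closed *before* a batch
// that would push the estimate past max_row_group_bytes.
std::vector<int64_t> WriteBatches(const std::vector<int64_t>& batch_sizes,
                                  int64_t max_row_group_bytes) {
  std::vector<int64_t> row_groups;
  ToyRowGroupWriter current;
  for (int64_t batch : batch_sizes) {
    const int64_t estimate = current.EstimatedTotalBytes();
    // Never close an empty row group, so a single oversized batch still
    // lands somewhere (and is the only case that can exceed the limit).
    if (estimate > 0 && estimate + batch > max_row_group_bytes) {
      row_groups.push_back(estimate);
      current = ToyRowGroupWriter{};
    }
    current.buffered_bytes += batch;
  }
  if (current.EstimatedTotalBytes() > 0) {
    row_groups.push_back(current.EstimatedTotalBytes());
  }
  return row_groups;
}
```

Under this assumption, every closed row group stays at or below the limit unless a single batch is itself larger than the limit.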