alamb commented on code in PR #6216:
URL: https://github.com/apache/arrow-rs/pull/6216#discussion_r1718593954
##########
parquet/src/file/statistics.rs:
##########
@@ -246,11 +245,7 @@ pub fn to_thrift(stats: Option<&Statistics>) -> Option<TStatistics> {
     let mut thrift_stats = TStatistics {
         max: None,
         min: None,
-        null_count: if stats.has_nulls() {
-            Some(stats.null_count() as i64)
-        } else {
-            None
-        },
+        null_count: stats.null_count_opt().map(|value| value as i64),

Review Comment:
   I agree the new behavior is desired, but I think it changes what values are written to parquet files (specifically, the parquet metadata will now have the thrift equivalent of `Some(0)` rather than the equivalent of `None`).

   I filed https://github.com/apache/arrow-rs/issues/6256 to track this.

   As this PR is already quite large, I think we should split it into two parts:
   1. The API changes
   2. The change for writing the metadata

   I plan to update this PR to revert the changes to the metadata writing, and will then make a follow-on PR to discuss / propose changing the statistics that are written to the file.
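
   To make the difference in written values concrete, here is a minimal sketch of the two branches from the diff. The `Stats` struct below is a hypothetical stand-in, not the parquet crate's `Statistics` type, and it assumes that `has_nulls()` means `null_count() > 0` and that `null_count()` falls back to 0 when the count is unknown. Only the two expressions under comparison are taken from the diff.

```rust
// Hypothetical stand-in for the statistics type, for illustration only.
// It is NOT the parquet crate's `Statistics` API.
#[derive(Debug, Clone, Copy)]
struct Stats {
    // `None` models "null count not known".
    null_count: Option<u64>,
}

impl Stats {
    fn null_count_opt(&self) -> Option<u64> {
        self.null_count
    }

    // Assumption: an unknown count is reported as 0 by this accessor.
    fn null_count(&self) -> u64 {
        self.null_count.unwrap_or(0)
    }

    // Assumption: "has nulls" means the count is strictly positive.
    fn has_nulls(&self) -> bool {
        self.null_count() > 0
    }
}

// The expression removed by the diff: a zero count is written as `None`.
fn old_thrift_null_count(stats: &Stats) -> Option<i64> {
    if stats.has_nulls() {
        Some(stats.null_count() as i64)
    } else {
        None
    }
}

// The expression added by the diff: a known zero count is written as `Some(0)`.
fn new_thrift_null_count(stats: &Stats) -> Option<i64> {
    stats.null_count_opt().map(|value| value as i64)
}

fn main() {
    let cases = [
        Stats { null_count: Some(0) },
        Stats { null_count: Some(3) },
        Stats { null_count: None },
    ];
    for stats in cases.iter() {
        println!(
            "{:?}: old = {:?}, new = {:?}",
            stats,
            old_thrift_null_count(stats),
            new_thrift_null_count(stats)
        );
    }
}
```

   Under these assumptions the two versions agree for a positive count and for an unknown count, and differ only when the count is known to be zero, which is the case that changes what ends up in the file metadata.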