alamb commented on code in PR #6216:
URL: https://github.com/apache/arrow-rs/pull/6216#discussion_r1718593954


##########
parquet/src/file/statistics.rs:
##########
@@ -246,11 +245,7 @@ pub fn to_thrift(stats: Option<&Statistics>) -> Option<TStatistics> {
     let mut thrift_stats = TStatistics {
         max: None,
         min: None,
-        null_count: if stats.has_nulls() {
-            Some(stats.null_count() as i64)
-        } else {
-            None
-        },
+        null_count: stats.null_count_opt().map(|value| value as i64),

Review Comment:
   I agree the new behavior is desired, but I think it changes what values are 
written to parquet files (specifically, the parquet metadata will now have the 
thrift equivalent of `Some(0)` rather than the equivalent of `None`). I filed 
https://github.com/apache/arrow-rs/issues/6256 to track this.
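   
   To make the difference concrete, here is a minimal sketch of the two 
expressions in isolation. The helpers `old_null_count` and `new_null_count` are 
hypothetical stand-ins, not part of the parquet crate, and the sketch assumes 
that `has_nulls()` was equivalent to `null_count > 0`:

```rust
// Hypothetical stand-ins for the removed and added `null_count` expressions
// in `to_thrift`. They take plain values instead of a `Statistics` object.

/// Old behavior: a null count of zero is written as "absent".
fn old_null_count(null_count: u64) -> Option<i64> {
    // Assumption: `has_nulls()` returned true only when the count was > 0.
    if null_count > 0 {
        Some(null_count as i64)
    } else {
        None
    }
}

/// New behavior: the optional count is passed through unchanged.
fn new_null_count(null_count_opt: Option<u64>) -> Option<i64> {
    null_count_opt.map(|value| value as i64)
}

fn main() {
    // A known null count of zero is where the two versions diverge.
    println!("old, zero nulls:  {:?}", old_null_count(0));
    println!("new, zero nulls:  {:?}", new_null_count(Some(0)));

    // A non-zero count is written identically by both.
    println!("old, three nulls: {:?}", old_null_count(3));
    println!("new, three nulls: {:?}", new_null_count(Some(3)));
}
```

   Under that assumption, only the zero case changes what ends up in the 
metadata, which is the case the linked issue is meant to track.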
   
   As this PR is already quite large, I think we should split it into two parts:
   1. The API changes
   2. The change for writing the metadata
   
   I plan to update this PR to revert the changes to the metadata writing, and 
will then make a follow-on PR to discuss / propose changing the statistics that 
are written to the file.
   
   


