etseidl commented on code in PR #9619:
URL: https://github.com/apache/arrow-rs/pull/9619#discussion_r3314601355


##########
parquet/src/column/writer/mod.rs:
##########
@@ -900,6 +904,27 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E> 
{
         Ok(())
     }
 
+    // For float columns, always provide Some(n), even if n is 0
+    // For non-float columns, always provide None
+    fn get_nan_count<T: ParquetValueType>(&self) -> Option<i64> {
+        let nan_count = || {
+            let nan_count = self.page_metrics.num_page_nans.unwrap_or(0);
+            match i64::try_from(nan_count) {
+                Ok(count) => Some(count),
+                _ => Some(i64::MAX),

Review Comment:
   I wasn't quite sure what to do here, but I can't imagine overflowing an i64. 
Probably any non-zero number would work since most use cases I can think of 
only need to know if NaN is present, not the absolute count. 
   
   Alternatively we could change the signature of `update_column_offset_index` 
to return `Result<()>` and return the conversion error here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to