alamb commented on code in PR #6216:
URL: https://github.com/apache/arrow-rs/pull/6216#discussion_r1715903094


##########
parquet/tests/arrow_writer_layout.rs:
##########
@@ -189,7 +189,7 @@ fn test_primitive() {
                     pages: (0..8)
                         .map(|_| Page {
                             rows: 250,
-                            page_header_size: 36,
+                            page_header_size: 38,

Review Comment:
   why are these changes needed?



##########
parquet/src/file/statistics.rs:
##########
@@ -385,18 +380,13 @@ impl Statistics {
 
     /// Returns number of null values for the column.
     /// Note that this includes all nulls when column is part of the complex 
type.

Review Comment:
   ```suggestion
       /// Returns number of null values for the column, if known.
       /// Note that this includes all nulls when column is part of the complex 
type.
   ```



##########
parquet/src/file/statistics.rs:
##########
@@ -538,37 +528,47 @@ impl<T: ParquetValueType> ValueStatistics<T> {
     ///
     /// Panics if min value is not set, e.g. all values are `null`.
     /// Use `has_min_max_set` method to check that.
-    pub fn min(&self) -> &T {
+    pub(crate) fn min_unchecked(&self) -> &T {
         self.min.as_ref().unwrap()
     }
 
+    /// Returns min value of the statistics.

Review Comment:
   ```suggestion
       /// Returns min value of the statistics, if known.
   ```



##########
parquet/src/column/writer/mod.rs:
##########
@@ -769,8 +769,8 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E> {
                 }
                 Some(stat) => {
                     // Check if min/max are still ascending/descending across 
pages
-                    let new_min = stat.min();
-                    let new_max = stat.max();
+                    let new_min = stat.min_unchecked();

Review Comment:
   I think it is guaranteed to be set at this point (this is the code that is 
responsible for writing the statistics)
   



##########
parquet/src/file/statistics.rs:
##########
@@ -601,6 +601,7 @@ impl<T: ParquetValueType> ValueStatistics<T> {
     /// using signed comparison. This resulted in an undefined ordering for 
unsigned
     /// quantities, such as booleans and unsigned integers.
     ///
+    // TODO: I don't see `min_value` or `max_value` in this struct. Should 
these be something else?

Review Comment:
   I don't think we should worry about making a nicer API for the older 
versions of parquet. We can do so if/when someone has such a file and a real 
usecase



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to