etseidl commented on code in PR #8098:
URL: https://github.com/apache/arrow-rs/pull/8098#discussion_r2282665984
##########
parquet/src/file/properties.rs:
##########
@@ -665,7 +667,7 @@ impl WriterPropertiesBuilder {
assert!(value > 0, "Cannot have a 0 statistics truncate length. If
you wish to disable min/max value truncation, set it to `None`.");
}
- self.statistics_truncate_length = max_length;
+ self.default_column_properties.statistics_truncate_length =
Some(max_length);
Review Comment:
You should also add a setter to the column properties impl rather than
assigning it directly here.
##########
parquet/src/file/properties.rs:
##########
@@ -1155,6 +1177,11 @@ impl ColumnProperties {
fn bloom_filter_properties(&self) -> Option<&BloomFilterProperties> {
self.bloom_filter_properties.as_ref()
}
+
+ /// Returns the statistics truncate length for this column.
+ fn statistics_truncate_length(&self) -> Option<Option<usize>> {
Review Comment:
This makes my head hurt 😅. I assume the fear here is that the `Default`
implementation for `ColumnProperties` won't set the truncation length to the
default value, but will instead leave it `None` in the case that a different
property is set for the column. So the double layer of option is to distinguish
between `None` because that's what the user set vs `None` because the option
hasn't been set.
Perhaps we should just implement a sensible `Default` for `ColumnProperties?`
##########
parquet/src/file/properties.rs:
##########
@@ -665,7 +667,7 @@ impl WriterPropertiesBuilder {
assert!(value > 0, "Cannot have a 0 statistics truncate length. If
you wish to disable min/max value truncation, set it to `None`.");
}
- self.statistics_truncate_length = max_length;
+ self.default_column_properties.statistics_truncate_length =
Some(max_length);
Review Comment:
Can this be moved down below line 709 so it's with the other per-column
properties?
```
// Setters for any column (global)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]