jhorstmann commented on code in PR #9653:
URL: https://github.com/apache/arrow-rs/pull/9653#discussion_r3045683398
##########
parquet/src/arrow/arrow_writer/levels.rs:
##########
@@ -751,6 +825,11 @@ pub(crate) struct ArrayLevels {
/// cached logical nulls of the array.
logical_nulls: Option<NullBuffer>,
+
+ /// When set, all def/rep levels are a single repeated value and the
+ /// Vec fields above are empty. Tuple: (def_value, rep_value, count).
+ /// This avoids materializing large Vecs for entirely-null columns.
+ uniform_levels: Option<(i16, i16, usize)>,
Review Comment:
I wonder if the logic around these optionals and
`extend_uniform_null_levels` could be made clearer with an enum. The `None`
case for `def_levels`/`rep_levels` also seems similar to a uniform value of
`0`, so it might fit into the same representation. Maybe it could look
something like:
```rust
enum LevelData {
    /// One level per slot, fully materialized.
    Vec(Vec<i16>),
    /// A single level repeated: (value, count).
    Uniform(i16, usize),
}

struct ArrayLevels {
    def_levels: LevelData,
    rep_levels: LevelData,
    // ...
}
```
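
To make the idea a bit more concrete, here is a rough sketch of how such an enum might absorb the extend logic. It is only an illustration and not the code in this PR: the variant name `Materialized` and the methods `len`, `get`, and `extend_uniform` are hypothetical, and the fallback from the uniform to the materialized form is an assumption about how the two cases could be unified.

```rust
/// Hypothetical level storage, following the enum suggested above.
#[derive(Debug, Clone, PartialEq)]
enum LevelData {
    /// One level per slot, fully materialized.
    Materialized(Vec<i16>),
    /// `count` copies of `value`, stored without allocating a Vec.
    Uniform { value: i16, count: usize },
}

impl LevelData {
    /// Number of levels represented, regardless of the storage form.
    fn len(&self) -> usize {
        match self {
            LevelData::Materialized(levels) => levels.len(),
            LevelData::Uniform { count, .. } => *count,
        }
    }

    /// Level at `idx`, or `None` when `idx` is out of range.
    fn get(&self, idx: usize) -> Option<i16> {
        match self {
            LevelData::Materialized(levels) => levels.get(idx).copied(),
            LevelData::Uniform { value, count } => (idx < *count).then_some(*value),
        }
    }

    /// Appends `count` copies of `value`. The data stays uniform while every
    /// appended value matches; otherwise it falls back to a materialized Vec.
    fn extend_uniform(&mut self, value: i16, count: usize) {
        match self {
            // Same value (or nothing stored yet): only bump the counter.
            LevelData::Uniform { value: v, count: c } if *v == value || *c == 0 => {
                *v = value;
                *c += count;
            }
            // Different value: materialize the existing run, then append.
            LevelData::Uniform { value: v, count: c } => {
                let mut levels = vec![*v; *c];
                levels.resize(*c + count, value);
                *self = LevelData::Materialized(levels);
            }
            // Already materialized: append in place.
            LevelData::Materialized(levels) => {
                levels.resize(levels.len() + count, value);
            }
        }
    }
}
```

With something along these lines, an entirely-null column would stay in the `Uniform` form and never allocate, while callers would only go through `len`/`get` and would not need to check a separate optional.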