alamb commented on code in PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#discussion_r2676536711
##########
parquet/src/arrow/array_reader/struct_array.rs:
##########
@@ -124,16 +124,15 @@ impl ArrayReader for StructArrayReader {
return Err(general_err!("Not all children array length are the same!"));
}
- // Now we can build array data
- let mut array_data_builder = ArrayDataBuilder::new(self.data_type.clone())
- .len(children_array_len)
- .child_data(
- children_array
- .into_iter()
- .map(|x| x.into_data())
Review Comment:
Converting the child arrays into ArrayData is wasteful for at least two reasons:
1. They are just converted back to ArrayRefs below.
2. Each ArrayData requires at least one new allocation (the Vec of buffers).
##########
parquet/src/arrow/array_reader/struct_array.rs:
##########
@@ -169,11 +166,18 @@ impl ArrayReader for StructArrayReader {
return Err(general_err!("Failed to decode level data for struct array"));
}
- array_data_builder = array_data_builder.null_bit_buffer(Some(bitmap_builder.into()));
+ nulls = Some(NullBuffer::new(bitmap_builder.finish()));
Review Comment:
NullBuffer::new counts the set bits, but so do the existing code paths, so this change should not add extra work.
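
For readers unfamiliar with what "counts the set bits" costs, here is a rough stdlib-only sketch of deriving a null count from a packed validity bitmap. It is not the arrow-rs implementation: the function names `count_set_bits` and `null_count` are hypothetical, and the only assumption carried over from Arrow is that validity bits are packed LSB-first, with a set bit meaning "valid".

```rust
/// Count the set bits among the first `len` bits of an LSB-first packed bitmap.
/// Hypothetical helper for illustration; not part of arrow-rs.
fn count_set_bits(bitmap: &[u8], len: usize) -> usize {
    let full_bytes = len / 8;
    // Whole bytes: one popcount per byte.
    let mut count: usize = bitmap[..full_bytes]
        .iter()
        .map(|b| b.count_ones() as usize)
        .sum();
    // Trailing partial byte: mask off the bits beyond `len`.
    let rem = len % 8;
    if rem > 0 {
        let mask = (1u8 << rem) - 1;
        count += (bitmap[full_bytes] & mask).count_ones() as usize;
    }
    count
}

/// Nulls are the unset bits, so the null count is `len` minus the set bits.
/// Hypothetical helper for illustration; not part of arrow-rs.
fn null_count(bitmap: &[u8], len: usize) -> usize {
    len - count_set_bits(bitmap, len)
}
```

The pass is linear in the bitmap length, so the cost only matters if it is paid more often than before, which is the point of the comment above.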