friendlymatthew commented on code in PR #8839:
URL: https://github.com/apache/arrow-rs/pull/8839#discussion_r2557955258
##########
arrow-row/src/lib.rs:
##########
@@ -1762,6 +1907,110 @@ unsafe fn decode_column(
},
_ => unreachable!(),
},
+ Codec::Union(converters, null_rows, _mode) => {
+ let len = rows.len();
+
+ let DataType::Union(union_fields, mode) = &field.data_type else {
+ unreachable!()
+ };
+
+ let mut type_ids = Vec::with_capacity(len);
+ let mut rows_by_field: Vec<Vec<(usize, &[u8])>> = vec![Vec::new(); converters.len()];
+
+ for (idx, row) in rows.iter_mut().enumerate() {
+ // skip the null sentinel
+ let mut cursor = 1;
Review Comment:
Hm, I thought union arrays don't support physical nulls, i.e.
`UnionArray::nulls()` always returns `None`:
https://docs.rs/arrow-array/57.0.0/src/arrow_array/array/union_array.rs.html#777
Nulls in union arrays are represented by nulls in the child arrays, not by a
physical null buffer on the union itself. For example, consider an element at
index 1 with type id 0 pointing to a null in the Int32 child array. The union
element itself is not null; it's a valid union element that happens to point
to a null value.
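To make the distinction concrete, here is a minimal sketch that models it with a toy type. `ToyUnion`, `example_union`, and `is_logically_null` are hypothetical names made up for illustration; this is not the arrow-rs `UnionArray` API, only a stand-in for the semantics described above (no validity buffer on the union, nulls living in the child).

```rust
/// Hypothetical stand-in for a dense union with a single Int32 child (type id 0).
/// This is NOT the arrow-rs `UnionArray`; it only models the null semantics.
struct ToyUnion {
    type_ids: Vec<i8>,
    offsets: Vec<usize>,
    int_child: Vec<Option<i32>>,
}

impl ToyUnion {
    /// The union carries no validity buffer of its own, so there is
    /// never a physical null buffer to return.
    fn nulls(&self) -> Option<Vec<bool>> {
        None
    }

    /// A slot is logically null only when the child value it points at is null.
    fn is_logically_null(&self, idx: usize) -> bool {
        match self.type_ids[idx] {
            0 => self.int_child[self.offsets[idx]].is_none(),
            // Assumption: no other children exist in this toy model.
            _ => false,
        }
    }
}

/// Three union slots, all of type id 0; slot 1 points at a null child value.
fn example_union() -> ToyUnion {
    ToyUnion {
        type_ids: vec![0, 0, 0],
        offsets: vec![0, 1, 2],
        int_child: vec![Some(7), None, Some(9)],
    }
}
```

In this model the element at index 1 is a valid union slot whose child value is null, which matches the example in the paragraph above.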
Now that I think about it, I wonder if we can just eagerly encode `0x01` as
the null sentinel byte 🤔
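A rough sketch of what that could look like, under loud assumptions: the byte layout after the sentinel (one type id byte, then the child's row bytes) and the names `VALID_SENTINEL`, `encode_union_row`, and `decode_union_row` are hypothetical and not taken from the PR. Only the leading `0x01` byte and the `cursor = 1` skip come from the discussion and the diff.

```rust
/// Sentinel written for every union row; `0x01` is the value suggested in the comment.
const VALID_SENTINEL: u8 = 0x01;

/// Hypothetical encoder: constant sentinel, then the type id, then the
/// already-encoded child bytes. The real row layout may differ.
fn encode_union_row(type_id: i8, child_bytes: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(2 + child_bytes.len());
    out.push(VALID_SENTINEL);
    out.push(type_id as u8);
    out.extend_from_slice(child_bytes);
    out
}

/// Hypothetical decoder: skip the constant sentinel, read the type id,
/// and hand the remaining bytes to the child decoder.
fn decode_union_row(row: &[u8]) -> (i8, &[u8]) {
    let cursor = 1; // skip the null sentinel, as in the diff
    (row[cursor] as i8, &row[cursor + 1..])
}
```

Because the sentinel is constant here, nullness would have to be carried entirely by the child's own encoding, which is consistent with the union having no physical null buffer.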