friendlymatthew commented on code in PR #8839:
URL: https://github.com/apache/arrow-rs/pull/8839#discussion_r2557955258


##########
arrow-row/src/lib.rs:
##########
@@ -1762,6 +1907,110 @@ unsafe fn decode_column(
             },
             _ => unreachable!(),
         },
+        Codec::Union(converters, null_rows, _mode) => {
+            let len = rows.len();
+
+            let DataType::Union(union_fields, mode) = &field.data_type else {
+                unreachable!()
+            };
+
+            let mut type_ids = Vec::with_capacity(len);
+            let mut rows_by_field: Vec<Vec<(usize, &[u8])>> = vec![Vec::new(); converters.len()];
+
+            for (idx, row) in rows.iter_mut().enumerate() {
+                // skip the null sentinel
+                let mut cursor = 1;

Review Comment:
   Hm, I thought union arrays don't support physical nulls, i.e. 
`UnionArray::nulls()` always returns `None`: 
https://docs.rs/arrow-array/57.0.0/src/arrow_array/array/union_array.rs.html#777
   
   Nulls in union arrays are represented by nulls in the child arrays, not by a 
physical null buffer on the union itself. For example, consider an element at 
index 1 with type id 0 that points to a null in the Int32 child array. The 
union element itself is not null; it is a valid union element that happens to 
point to a null value.
   
   Now that I think about it, I wonder if we can just eagerly encode `0x01` as 
the null sentinel byte 🤔 
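
   To make the distinction concrete, here is a rough, std-only sketch. It is a 
toy model, not the arrow-rs `UnionArray`: the names `ToyDenseUnion`, 
`physical_nulls`, `is_logically_null`, `union_sentinel`, and `example` are all 
hypothetical, and the sentinel helper only illustrates the "always write 
`0x01`" idea under the assumption that `0x01` marks a valid row.

```rust
/// Toy stand-in for a dense union with an Int32 child (type id 0) and a
/// Utf8 child (type id 1). Hypothetical type, not the arrow-rs `UnionArray`.
#[derive(Debug)]
struct ToyDenseUnion {
    type_ids: Vec<i8>,         // which child each slot points into
    offsets: Vec<usize>,       // position of the slot's value within that child
    ints: Vec<Option<i32>>,    // child 0: nulls live here
    strs: Vec<Option<String>>, // child 1: nulls live here
}

impl ToyDenseUnion {
    /// The union itself carries no validity buffer, so this is always `None`.
    fn physical_nulls(&self) -> Option<Vec<bool>> {
        None
    }

    /// A slot is logically null when the child value it points to is null.
    fn is_logically_null(&self, idx: usize) -> bool {
        let off = self.offsets[idx];
        match self.type_ids[idx] {
            0 => self.ints[off].is_none(),
            1 => self.strs[off].is_none(),
            other => panic!("unknown type id {other}"),
        }
    }
}

/// Assumption for illustration: `0x01` marks a valid (non-null) row.
const VALID_SENTINEL: u8 = 0x01;

/// Sketch of the suggestion: the union-level sentinel never depends on the
/// data, because nullness is carried by the encoded child value instead.
fn union_sentinel(_union: &ToyDenseUnion, _idx: usize) -> u8 {
    VALID_SENTINEL
}

/// Builds the case from the comment: index 1 has type id 0 and points to a
/// null in the Int32 child.
fn example() -> ToyDenseUnion {
    ToyDenseUnion {
        type_ids: vec![0, 0, 1],
        offsets: vec![0, 1, 0],
        ints: vec![Some(7), None],
        strs: vec![Some("a".to_string())],
    }
}
```

   In this model, slot 1 of `example()` is logically null while 
`physical_nulls()` stays `None`, and the sentinel byte is the same for every 
slot.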



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
