liamzwbao commented on code in PR #8831:
URL: https://github.com/apache/arrow-rs/pull/8831#discussion_r2565824405


##########
parquet-variant-compute/src/shred_variant.rs:
##########
@@ -236,6 +254,285 @@ impl<'a> VariantToShreddedPrimitiveVariantRowBuilder<'a> {
     }
 }
 
+pub(crate) struct VariantToShreddedArrayVariantRowBuilder<'a> {
+    value_builder: VariantValueArrayBuilder,
+    typed_value_builder: ArrayVariantToArrowRowBuilder<'a>,
+}
+
+impl<'a> VariantToShreddedArrayVariantRowBuilder<'a> {
+    fn try_new(
+        data_type: &'a DataType,
+        cast_options: &'a CastOptions,
+        capacity: usize,
+    ) -> Result<Self> {
+        Ok(Self {
+            value_builder: VariantValueArrayBuilder::new(capacity),
+            typed_value_builder: ArrayVariantToArrowRowBuilder::try_new(
+                data_type,
+                cast_options,
+                capacity,
+            )?,
+        })
+    }
+
+    fn append_null(&mut self) -> Result<()> {
+        self.value_builder.append_value(Variant::Null);
+        self.typed_value_builder.append_null();
+        Ok(())
+    }
+
+    fn append_value(&mut self, value: Variant<'_, '_>) -> Result<bool> {
+        // If the value is not an array, typed_value must be null.
+        // If the value is an array, value must be null.
+        match value {
+            Variant::List(list) => {
+                self.value_builder.append_null();
+                self.typed_value_builder.append_value(list)?;
+                Ok(true)
+            }
+            other => {
+                self.value_builder.append_value(other);
+                self.typed_value_builder.append_null();
+                Ok(false)
+            }
+        }
+    }
+
+    fn finish(self) -> Result<(BinaryViewArray, ArrayRef, Option<NullBuffer>)> 
{
+        Ok((
+            self.value_builder.build()?,
+            self.typed_value_builder.finish()?,
+            // All elements of an array must be present (not missing) because
+            // the array Variant encoding does not allow missing elements
+            None,
+        ))
+    }
+}
+
+enum ArrayVariantToArrowRowBuilder<'a> {
+    List(VariantToListArrowRowBuilder<'a, i32>),
+    LargeList(VariantToListArrowRowBuilder<'a, i64>),
+    ListView(VariantToListViewArrowRowBuilder<'a, i32>),
+    LargeListView(VariantToListViewArrowRowBuilder<'a, i64>),

Review Comment:
   I tried to refactor the code using `ListLikeArrayBuilder`, but it doesn’t 
help much ([draft PR](https://github.com/liamzwbao/arrow-rs/pull/1/files)). 
Plus, it would also force the builder into a boxed dyn trait, adding overhead 
versus the enum approach.
   
   I think the current impl actaully follow what we did in `varaint_to_arrow`, 
the code structure looks like below
   ```
   Variant (VariantToShreddedVariantRowBuilder)
   ├─ Primitive (VariantToShreddedPrimitiveVariantRowBuilder)
   │  ├─ Int/Float group 
(PrimitiveVariantToArrowRowBuilder::VariantToPrimitiveArrowRowBuilder):
   │  │    Null, Boolean, Int8/16/32/64, UInt8/16/32/64,
   │  │    Float16/32/64, Decimal32/64/128/256
   │  ├─ Time-like 
(PrimitiveVariantToArrowRowBuilder::VariantToTimestampArrowRowBuilder):
   │  │    Date32, Time64(us), Timestamp(us/ns) [tz or ntz]
   │  └─ String/Binary 
(PrimitiveVariantToArrowRowBuilder::VariantToStringArrowRowBuilder/VariantToBinaryArrowRowBuilder):
   │       Utf8, Utf8View, Binary, BinaryView
   ├─ Array (VariantToShreddedArrayVariantRowBuilder)
   │  ├─ List          
(ArrayVariantToArrowRowBuilder::VariantToListArrowRowBuilder)
   │  ├─ LargeList     
(ArrayVariantToArrowRowBuilder::VariantToListArrowRowBuilder)
   │  ├─ ListView      
(ArrayVariantToArrowRowBuilder::VariantToListViewArrowRowBuilder)
   │  └─ LargeListView 
(ArrayVariantToArrowRowBuilder::VariantToListViewArrowRowBuilder)
   │      └─ each element uses 
make_variant_to_shredded_variant_arrow_row_builder recursively
   └─ Object (VariantToShreddedObjectVariantRowBuilder)
   ```
   
   `StringLikeArrayBuilder` and `BinaryLikeArrayBuilder` mainly simplify 
`VariantToStringArrowRowBuilder` and `VariantToBinaryArrowRowBuilder` but still 
need dispatch from `PrimitiveVariantToArrowRowBuilder`. In theory, 
`ListLikeArrayBuilder` could simplify `nulls`, `offsets`, and `sizes` via 
`GenericList(View)Builder`, but we’d need a lot of adapters, because 
`element_builder` is Box<VariantToShreddedVariantRowBuilder<'a>> and doesn’t 
implement `ArrayBuilder` directly to make the trait work in our case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to