liamzwbao commented on code in PR #8831:
URL: https://github.com/apache/arrow-rs/pull/8831#discussion_r2579390251


##########
parquet-variant-compute/src/shred_variant.rs:
##########
@@ -236,6 +254,285 @@ impl<'a> VariantToShreddedPrimitiveVariantRowBuilder<'a> {
     }
 }
 
+pub(crate) struct VariantToShreddedArrayVariantRowBuilder<'a> {
+    value_builder: VariantValueArrayBuilder,
+    typed_value_builder: ArrayVariantToArrowRowBuilder<'a>,
+}
+
+impl<'a> VariantToShreddedArrayVariantRowBuilder<'a> {
+    fn try_new(
+        data_type: &'a DataType,
+        cast_options: &'a CastOptions,
+        capacity: usize,
+    ) -> Result<Self> {
+        Ok(Self {
+            value_builder: VariantValueArrayBuilder::new(capacity),
+            typed_value_builder: ArrayVariantToArrowRowBuilder::try_new(
+                data_type,
+                cast_options,
+                capacity,
+            )?,
+        })
+    }
+
+    fn append_null(&mut self) -> Result<()> {
+        self.value_builder.append_value(Variant::Null);
+        self.typed_value_builder.append_null();
+        Ok(())
+    }
+
+    fn append_value(&mut self, value: Variant<'_, '_>) -> Result<bool> {
+        // If the value is not an array, typed_value must be null.
+        // If the value is an array, value must be null.
+        match value {
+            Variant::List(list) => {
+                self.value_builder.append_null();
+                self.typed_value_builder.append_value(list)?;
+                Ok(true)
+            }
+            other => {
+                self.value_builder.append_value(other);
+                self.typed_value_builder.append_null();
+                Ok(false)
+            }
+        }
+    }
+
+    fn finish(self) -> Result<(BinaryViewArray, ArrayRef, Option<NullBuffer>)> {
+        Ok((
+            self.value_builder.build()?,
+            self.typed_value_builder.finish()?,
+            // All elements of an array must be present (not missing) because
+            // the array Variant encoding does not allow missing elements
+            None,
+        ))
+    }
+}
+
+enum ArrayVariantToArrowRowBuilder<'a> {
+    List(VariantToListArrowRowBuilder<'a, i32>),
+    LargeList(VariantToListArrowRowBuilder<'a, i64>),
+    ListView(VariantToListViewArrowRowBuilder<'a, i32>),
+    LargeListView(VariantToListViewArrowRowBuilder<'a, i64>),

Review Comment:
   Hmm, I don't think there is a double enum dispatch in this PR. This is the 
only place where we match the `DataType` to specific builders. 
`ArrayVariantToArrowRowBuilder` is on the same level as 
`PrimitiveVariantToArrowRowBuilder` and `ObjectVariantToArrowRowBuilder`.
   
   The top-level enum `PrimitiveVariantToArrowRowBuilder` dispatches to a 
builder for each concrete type: 
https://github.com/apache/arrow-rs/blob/d212713289d9cd474a849cc7da0d6c125f3141f1/parquet-variant-compute/src/variant_to_arrow.rs#L66-L71
   
   This PR does the same thing in `ArrayVariantToArrowRowBuilder`: this 
top-level enum dispatches to one of 4 list builders, one for each concrete 
list type. 
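   
   To illustrate what I mean by a single level of dispatch, here is a rough, 
self-contained sketch. `ListKind`, `ListBuilder`, `ListViewBuilder` and 
`ArrayBuilder` are hypothetical, simplified stand-ins (not the actual types in 
this PR or in arrow-rs); the only point is that the type is matched to a 
concrete builder in exactly one place, and every other method just forwards.

```rust
/// Hypothetical stand-in for the four list-like `DataType`s.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ListKind {
    List,
    LargeList,
    ListView,
    LargeListView,
}

/// Hypothetical simplified builder, generic over the offset width `O`.
pub struct ListBuilder<O> {
    sizes: Vec<O>,
}

/// Hypothetical simplified view builder with the same shape.
pub struct ListViewBuilder<O> {
    sizes: Vec<O>,
}

impl<O: TryFrom<usize>> ListBuilder<O> {
    pub fn new() -> Self {
        Self { sizes: Vec::new() }
    }

    /// Records one list of `len` elements, failing if `len` does not fit in `O`.
    pub fn append(&mut self, len: usize) -> Result<(), String> {
        let size = O::try_from(len)
            .map_err(|_| format!("list length {} overflows the offset type", len))?;
        self.sizes.push(size);
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.sizes.len()
    }
}

impl<O: TryFrom<usize>> ListViewBuilder<O> {
    pub fn new() -> Self {
        Self { sizes: Vec::new() }
    }

    /// Same contract as `ListBuilder::append` in this sketch.
    pub fn append(&mut self, len: usize) -> Result<(), String> {
        let size = O::try_from(len)
            .map_err(|_| format!("list length {} overflows the offset type", len))?;
        self.sizes.push(size);
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.sizes.len()
    }
}

/// Single top-level enum: one variant per concrete list type.
pub enum ArrayBuilder {
    List(ListBuilder<i32>),
    LargeList(ListBuilder<i64>),
    ListView(ListViewBuilder<i32>),
    LargeListView(ListViewBuilder<i64>),
}

impl ArrayBuilder {
    /// The only place where the requested type is matched to a builder.
    pub fn new(kind: ListKind) -> Self {
        match kind {
            ListKind::List => Self::List(ListBuilder::new()),
            ListKind::LargeList => Self::LargeList(ListBuilder::new()),
            ListKind::ListView => Self::ListView(ListViewBuilder::new()),
            ListKind::LargeListView => Self::LargeListView(ListViewBuilder::new()),
        }
    }

    /// Forwards to whichever concrete builder was selected in `new`.
    pub fn append(&mut self, len: usize) -> Result<(), String> {
        match self {
            Self::List(b) => b.append(len),
            Self::LargeList(b) => b.append(len),
            Self::ListView(b) => b.append(len),
            Self::LargeListView(b) => b.append(len),
        }
    }

    /// Number of lists appended so far.
    pub fn len(&self) -> usize {
        match self {
            Self::List(b) => b.len(),
            Self::LargeList(b) => b.len(),
            Self::ListView(b) => b.len(),
            Self::LargeListView(b) => b.len(),
        }
    }
}
```

   In this sketch the forwarding `match` arms are the remaining duplication; 
the choice of builder itself happens only in `ArrayBuilder::new`.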
   
   BTW, I updated the [draft 
PR](https://github.com/liamzwbao/arrow-rs/pull/1/files) to make 
`ArrayVariantToArrowRowBuilder` fairly slim, but it basically moves the `match 
data_type` from this builder to the [caller 
side](https://github.com/liamzwbao/arrow-rs/pull/1/files#diff-fd0ec8577d0f6349176a0dfa6a1d201b79823a965c45b6b63400c5fe360e348cR261-R266)
 and does not reduce the duplication much.
   



