scovich commented on code in PR #8360:
URL: https://github.com/apache/arrow-rs/pull/8360#discussion_r2354365293


##########
parquet-variant-compute/src/variant_to_arrow.rs:
##########
@@ -246,36 +249,40 @@ where
 
 /// Builder for creating VariantArray output (for path extraction without type conversion)
 pub(crate) struct VariantToBinaryVariantArrowRowBuilder {
-    builder: VariantArrayBuilder,
+    metadata: BinaryViewArray,
+    builder: VariantValueArrayBuilder,
+    nulls: NullBufferBuilder,
 }
 
 impl VariantToBinaryVariantArrowRowBuilder {
-    fn new(capacity: usize) -> Self {
+    fn new(metadata: BinaryViewArray, capacity: usize) -> Self {
         Self {
-            builder: VariantArrayBuilder::new(capacity),
+            metadata,
+            builder: VariantValueArrayBuilder::new(capacity),
+            nulls: NullBufferBuilder::new(capacity),
         }
     }
 }
 
 impl VariantToBinaryVariantArrowRowBuilder {
     fn append_null(&mut self) -> Result<()> {
         self.builder.append_null();
+        self.nulls.append_null();
         Ok(())
     }
 
     fn append_value(&mut self, value: &Variant<'_, '_>) -> Result<bool> {
-        // TODO: We need a way to convert a Variant directly to bytes. In particular, we want to
-        // just copy across the underlying value byte slice of a `Variant::Object` or
-        // `Variant::List`, without any interaction with a `VariantMetadata` (because the shredding
-        // spec requires us to reuse the existing metadata when unshredding).
-        //
-        // One could _probably_ emulate this with parquet_variant::VariantBuilder, but it would do a
-        // lot of unnecessary work and would also create a new metadata column we don't need.

Review Comment:
   @alamb -- I just realized while putting up 
https://github.com/apache/arrow-rs/pull/8366/ that the new 
`VariantValueArrayBuilder` is perfect for fixing this TODO (in addition to 
supporting shredding).
   
   So now it has an immediate use, instead of only being a building block for 
the next PR.
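
   For readers without the PR open, here is a rough, std-only sketch of the
   idea in the hunk above: keep the input metadata column untouched, collect
   only value bytes, and track nulls separately. Every name in it
   (`ToyValueRowBuilder`, its fields, `finish`) is a hypothetical stand-in,
   not the arrow-rs or parquet-variant API, and the byte values are arbitrary
   placeholders rather than valid variant encodings.

```rust
// Hypothetical stand-ins only: none of these types come from arrow-rs or parquet-variant.

/// Toy row builder that reuses the input metadata and collects only value bytes.
struct ToyValueRowBuilder {
    /// Per-row metadata, carried through unchanged from the input.
    metadata: Vec<Vec<u8>>,
    /// Per-row value bytes (an empty placeholder for null rows).
    values: Vec<Vec<u8>>,
    /// Per-row validity: `true` means the row is non-null.
    validity: Vec<bool>,
}

impl ToyValueRowBuilder {
    fn new(metadata: Vec<Vec<u8>>, capacity: usize) -> Self {
        Self {
            metadata,
            values: Vec::with_capacity(capacity),
            validity: Vec::with_capacity(capacity),
        }
    }

    /// Mirrors the two-step null handling in the diff: a placeholder value plus a null bit.
    fn append_null(&mut self) {
        self.values.push(Vec::new());
        self.validity.push(false);
    }

    /// Copies already-encoded value bytes across; the metadata is never consulted.
    fn append_value(&mut self, value_bytes: &[u8]) {
        self.values.push(value_bytes.to_vec());
        self.validity.push(true);
    }

    /// Returns (metadata, values, validity); the metadata is the same data passed to `new`.
    fn finish(self) -> (Vec<Vec<u8>>, Vec<Vec<u8>>, Vec<bool>) {
        assert_eq!(self.metadata.len(), self.values.len(), "one metadata entry per row");
        (self.metadata, self.values, self.validity)
    }
}

fn main() {
    // Arbitrary placeholder bytes, not real variant metadata or values.
    let metadata = vec![vec![0x01, 0x00], vec![0x01, 0x00]];
    let mut builder = ToyValueRowBuilder::new(metadata.clone(), 2);
    builder.append_value(&[0x0c, 0x2a]);
    builder.append_null();

    let (out_metadata, out_values, out_validity) = builder.finish();
    assert_eq!(out_metadata, metadata); // no new metadata column was created
    assert_eq!(out_values.len(), 2);
    assert_eq!(out_validity, vec![true, false]);
}
```

   The point of the sketch is only the shape: no metadata is rebuilt, so the
   output can pair the original metadata column with the newly built values.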


