scovich commented on code in PR #8360:
URL: https://github.com/apache/arrow-rs/pull/8360#discussion_r2354365293
##########
parquet-variant-compute/src/variant_to_arrow.rs:
##########
@@ -246,36 +249,40 @@ where
/// Builder for creating VariantArray output (for path extraction without type conversion)
pub(crate) struct VariantToBinaryVariantArrowRowBuilder {
- builder: VariantArrayBuilder,
+ metadata: BinaryViewArray,
+ builder: VariantValueArrayBuilder,
+ nulls: NullBufferBuilder,
}
impl VariantToBinaryVariantArrowRowBuilder {
- fn new(capacity: usize) -> Self {
+ fn new(metadata: BinaryViewArray, capacity: usize) -> Self {
Self {
- builder: VariantArrayBuilder::new(capacity),
+ metadata,
+ builder: VariantValueArrayBuilder::new(capacity),
+ nulls: NullBufferBuilder::new(capacity),
}
}
}
impl VariantToBinaryVariantArrowRowBuilder {
fn append_null(&mut self) -> Result<()> {
self.builder.append_null();
+ self.nulls.append_null();
Ok(())
}
fn append_value(&mut self, value: &Variant<'_, '_>) -> Result<bool> {
- // TODO: We need a way to convert a Variant directly to bytes. In particular, we want to
- // just copy across the underlying value byte slice of a `Variant::Object` or
- // `Variant::List`, without any interaction with a `VariantMetadata` (because the shredding
- // spec requires us to reuse the existing metadata when unshredding).
- //
- // One could _probably_ emulate this with parquet_variant::VariantBuilder, but it would do a
- // lot of unnecessary work and would also create a new metadata column we don't need.
Review Comment:
@alamb -- I just realized while putting up
https://github.com/apache/arrow-rs/pull/8366/ that this PR also makes it very
easy to fix this TODO (in addition to supporting shredding). So here it is.
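
For readers skimming the thread, here is a rough, std-only sketch of the idea the removed TODO describes: keep the input metadata column as-is, copy each row's already-encoded value bytes verbatim, and track nulls separately. `RawValueColumnBuilder`, `RawVariantColumn`, and every method on them are hypothetical stand-ins invented for illustration. They are not the `parquet-variant` or `arrow` API and do not reflect how `VariantValueArrayBuilder` is actually implemented.

```rust
// Hypothetical illustration only: these types are NOT part of parquet-variant or arrow.

/// Collects raw value bytes row by row while reusing a caller-supplied metadata column.
struct RawValueColumnBuilder {
    metadata: Vec<Vec<u8>>, // existing per-row metadata, carried through unchanged
    data: Vec<u8>,          // concatenated value bytes for all rows
    offsets: Vec<usize>,    // row i occupies data[offsets[i]..offsets[i + 1]]
    validity: Vec<bool>,    // false marks a null row
}

/// Finished column: untouched metadata plus copied value bytes and a validity mask.
struct RawVariantColumn {
    metadata: Vec<Vec<u8>>,
    data: Vec<u8>,
    offsets: Vec<usize>,
    validity: Vec<bool>,
}

impl RawValueColumnBuilder {
    fn new(metadata: Vec<Vec<u8>>, capacity: usize) -> Self {
        let mut offsets = Vec::with_capacity(capacity + 1);
        offsets.push(0);
        Self {
            metadata,
            data: Vec::new(),
            offsets,
            validity: Vec::with_capacity(capacity),
        }
    }

    /// Copy the already-encoded bytes verbatim: no metadata lookup, no re-encoding.
    fn append_value_bytes(&mut self, value: &[u8]) {
        self.data.extend_from_slice(value);
        self.offsets.push(self.data.len());
        self.validity.push(true);
    }

    /// A null row contributes an empty byte range and a cleared validity bit.
    fn append_null(&mut self) {
        self.offsets.push(self.data.len());
        self.validity.push(false);
    }

    fn finish(self) -> RawVariantColumn {
        RawVariantColumn {
            metadata: self.metadata,
            data: self.data,
            offsets: self.offsets,
            validity: self.validity,
        }
    }
}

impl RawVariantColumn {
    fn len(&self) -> usize {
        self.validity.len()
    }

    /// Value bytes for `row`, or `None` if the row is null.
    fn value(&self, row: usize) -> Option<&[u8]> {
        if self.validity[row] {
            Some(&self.data[self.offsets[row]..self.offsets[row + 1]])
        } else {
            None
        }
    }

    /// Metadata bytes for `row`, exactly as they were passed in.
    fn metadata(&self, row: usize) -> &[u8] {
        &self.metadata[row]
    }
}
```

In this sketch the metadata is never rebuilt, so the output rows stay readable against the metadata they were written with, which is the property the TODO attributes to the shredding spec.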