alamb commented on code in PR #7919:
URL: https://github.com/apache/arrow-rs/pull/7919#discussion_r2203367153
##########
parquet-variant-compute/src/variant_get.rs:
##########
@@ -0,0 +1,265 @@
+use std::sync::Arc;
+
+use arrow::{
+ array::{
+ Array, ArrayRef, ArrowPrimitiveType, BinaryArray, PrimitiveArray,
PrimitiveBuilder,
+ StructArray,
+ },
+ compute::CastOptions,
+ datatypes::UInt64Type,
+ error::Result,
+};
+use arrow_schema::{ArrowError, DataType, Field};
+use parquet_variant::Variant;
+
+use crate::utils::variant_from_struct_array;
+
+/// Returns an array with the specified path extracted from the variant values.
+pub fn variant_get(input: &ArrayRef, options: GetOptions) -> Result<ArrayRef> {
Review Comment:
I agree the row-based approach will likely be slower for non-shredded
variants, but it will still be needed in some cases (for example, when the
source arrays are not BinaryView).

If we have the `get_path` method, I think we can potentially implement fast
copies of variants by playing games with pointers: if the returned variant
points into the same buffer as the BinaryViewArray, we can make a view that
points there instead of copying the bytes.

However, I think that will be somewhat tricky and will require some `unsafe`,
so I suggest we get a first version in place that does the copy. Once we have
it working (with tests written, etc.), I think we'll be in a better position
to optimize.
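
To make the pointer check above concrete, here is a minimal std-only sketch.
It is not the arrow-rs implementation: `offset_within` is a hypothetical
helper, and plain byte slices stand in for a BinaryViewArray data buffer and
for the bytes of the variant returned by `get_path`.

```rust
/// Hypothetical helper (not part of arrow-rs or parquet-variant).
///
/// Returns the byte offset of `sub` inside `buf` when `sub` lies entirely
/// within `buf`'s memory range, and `None` otherwise. A caller could use the
/// offset to build a view into the existing buffer, and fall back to copying
/// the bytes when the result is `None`.
fn offset_within(buf: &[u8], sub: &[u8]) -> Option<usize> {
    // An empty slice may carry a dangling pointer, so treat it as
    // "not inside" and let the caller take the copy path.
    if sub.is_empty() {
        return None;
    }
    let buf_start = buf.as_ptr() as usize;
    let buf_end = buf_start + buf.len();
    let sub_start = sub.as_ptr() as usize;
    let sub_end = sub_start.checked_add(sub.len())?;

    if sub_start >= buf_start && sub_end <= buf_end {
        Some(sub_start - buf_start)
    } else {
        None
    }
}
```

The check itself needs no `unsafe`; the tricky part is turning the offset
into a view that shares the original buffer, which is why doing the plain
copy first seems like the safer starting point.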