alamb opened a new issue, #7893:
URL: https://github.com/apache/arrow-rs/issues/7893

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   - part of https://github.com/apache/arrow-rs/issues/6736
   
   Now that we have good structures to validate and manipulate Variants 
programmatically in Rust APIs, we need low-level kernels to manipulate the 
objects so we can query them from high-level languages.
   
   The goal is to implement queries such as the following in DataFusion and 
other arrow-rs based engines that select fields or elements from an object:
   
   Examples
   ```sql
   -- Extract the field "my_field" from the variant
   SELECT v["my_field"] FROM my_table;
   -- Extract the second element from the list stored in the field "x" of v
   SELECT v["x"][2] FROM my_table;
   -- Cast the field stored in "name" as a string
   SELECT CAST(v["name"] AS VARCHAR) FROM my_table;
   ```
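
To make the access pattern concrete, here is a minimal sketch of how a subscript chain such as `v["x"][2]` could be evaluated as a sequence of path steps. `ToyValue`, `PathStep`, and `get_path` are hypothetical stand-ins invented for illustration; they are not the arrow-rs `Variant` API. The sketch uses 0-based list offsets, which is an assumption: whether the SQL subscript is 0- or 1-based is left to the query engine.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a variant value (hypothetical, NOT the arrow-rs `Variant`).
#[derive(Debug, Clone, PartialEq)]
pub enum ToyValue {
    Int(i64),
    Str(String),
    List(Vec<ToyValue>),
    Object(BTreeMap<String, ToyValue>),
}

/// One step of a path: a named field or a 0-based list offset (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum PathStep {
    Field(String),
    Index(usize),
}

/// Walks `path` starting at `root`.
/// Returns `None` as soon as a step does not apply, e.g. a field lookup on
/// a list, a missing field, or an out-of-range offset.
pub fn get_path<'v>(root: &'v ToyValue, path: &[PathStep]) -> Option<&'v ToyValue> {
    path.iter().try_fold(root, |current, step| match (current, step) {
        (ToyValue::Object(fields), PathStep::Field(name)) => fields.get(name),
        (ToyValue::List(items), PathStep::Index(offset)) => items.get(*offset),
        _ => None,
    })
}
```

Under these assumptions, `v["x"][2]` with a 1-based SQL subscript would become `[PathStep::Field("x".into()), PathStep::Index(1)]`, and a path that does not match the value's shape yields `None` (a SQL NULL) rather than an error.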
   
   
   **Describe the solution you'd like**
   To implement this kind of access, I think we need a kernel that uses the 
Rust `Variant` API to extract and cast the data.
   
   **Describe alternatives you've considered**
   
   Here is a proposal based on the `variant_get` function from Databricks and 
feedback from @Samyak2 and @adriangb on 
https://github.com/apache/arrow-rs/issues/7715#issuecomment-3058466627
   
   I am sure the lifetimes need some more work
   
   ```rust
   /// Given a StructArray of Variant values (each stored as a struct with
   /// `metadata`, `value`, and optionally `typed_value` fields),
   /// returns the specified field or element
   pub fn variant_get(variant_array: StructArray, options: GetOptions<'_>) -> Result<ArrayRef> {
       // implementation elided in this proposal
       todo!()
   }
   
   /// Controls the action of the variant_get kernel
   ///
   /// If `as_type` is specified, `cast_options` controls what to do if the
   /// cast fails
   struct GetOptions<'a> {
     /// What path to extract
     path: VariantPath<'a>,
     /// if `as_type` is None, the returned array will itself be a StructArray
     /// with Variant values
     ///
     /// if `as_type` is `Some(type)` the field is returned as the specified
     /// type if possible. To specify returning
     /// a Variant, pass a Field with variant type in the metadata.
     as_type: Option<&'a Field>,
     /// Controls the casting behavior (e.g. error vs substituting null on
     /// cast error)
     cast_options: CastOptions<'a>,
   }
   
   /// Represents a qualified path to a potential subfield of an element
   struct VariantPath<'a>(Vec<VariantPathElement<'a>>);
   
   /// Element of a path
   enum VariantPathElement<'a> {
     /// Access field with name `name`
     Field {
       name: Cow<'a, str>,
     },
     /// Access the list element at offset `offset`
     Index {
       offset: usize,
     },
   }
   ```
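
The "error vs substituting null on cast error" behavior can be illustrated with a minimal sketch over a toy column. `ToyValue`, `ToyCastOptions`, and `cast_column_to_i64` are hypothetical names invented here; the `safe` flag is assumed to behave like the `safe` field on arrow's `CastOptions` (true substitutes null, false returns an error). The sketch is not the proposed kernel, only the casting policy in isolation.

```rust
/// Toy stand-in for a value already extracted from a variant
/// (hypothetical, NOT the arrow-rs `Variant` type).
#[derive(Debug, Clone, PartialEq)]
pub enum ToyValue {
    Int(i64),
    Str(String),
}

/// Hypothetical analogue of the `safe` flag on arrow's `CastOptions`.
#[derive(Debug, Clone, Copy)]
pub struct ToyCastOptions {
    /// true: a failed cast becomes null; false: a failed cast is an error.
    pub safe: bool,
}

/// Casts every extracted value in a column to `i64`.
/// `None` entries model rows where the path did not match and stay null.
pub fn cast_column_to_i64(
    column: &[Option<ToyValue>],
    options: ToyCastOptions,
) -> Result<Vec<Option<i64>>, String> {
    column
        .iter()
        .enumerate()
        .map(|(row, value)| match value {
            None => Ok(None),
            Some(ToyValue::Int(i)) => Ok(Some(*i)),
            Some(ToyValue::Str(s)) => match s.parse::<i64>() {
                Ok(i) => Ok(Some(i)),
                // Safe mode: substitute null for the value that failed to cast.
                Err(_) if options.safe => Ok(None),
                // Strict mode: the first failing row aborts the whole cast.
                Err(_) => Err(format!("row {row}: cannot cast {s:?} to i64")),
            },
        })
        .collect()
}
```

With `safe: true` a column containing the string `"abc"` produces a null in that row; with `safe: false` the same column makes the whole call return an error.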
   
   
   
   ### Prior Art
   Here is the Databricks function that does this: 
https://docs.databricks.com/gcp/en/sql/language-manual/functions/variant_get
   
   
   
   

