Samyak2 commented on issue #7715:
URL: https://github.com/apache/arrow-rs/issues/7715#issuecomment-3058494490

   > 1. What should the "path" argument be? A String? A JSON path? Some 
structured thing (`Vec`)?
   
   I'm not sure this is the right issue to discuss this, but I have some 
thoughts. I don't think the path should be a JSON path representation. 
`variant_get` in Databricks supports only a subset of JSON path: field 
accesses (`$.fieldName` or `$['fieldName']`) and array indexing (`$[0]`). 
See the 
[reference](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-json-path-expression)
 (look for "variant") for details.
   
   We can use an enum specifically for variant paths that supports only these 
two kinds of access. The path would then be a simple vec of segments, like 
@adriangb mentioned above (`Vec<VariantPathSegment>`).
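
A rough sketch of what that enum could look like, with a tiny parser for the two supported access forms. Everything here is illustrative: the variant names `Field` and `Index` and the `parse_path` helper are my assumptions, not an existing arrow-rs API, and the parser deliberately ignores edge cases such as a `]` inside a quoted field name.

```rust
/// Hypothetical path segment: only the two accesses discussed above.
#[derive(Debug, Clone, PartialEq)]
enum VariantPathSegment {
    /// `$.fieldName` or `$['fieldName']`
    Field(String),
    /// `$[0]`
    Index(usize),
}

/// Parse the supported JSON path subset into segments (illustrative helper).
fn parse_path(path: &str) -> Result<Vec<VariantPathSegment>, String> {
    let mut rest = path
        .strip_prefix('$')
        .ok_or_else(|| format!("path must start with '$': {path}"))?;
    let mut segments = Vec::new();

    while !rest.is_empty() {
        if let Some(after_dot) = rest.strip_prefix('.') {
            // `.fieldName`: the name runs until the next '.' or '['.
            let end = after_dot
                .find(|c: char| c == '.' || c == '[')
                .unwrap_or(after_dot.len());
            if end == 0 {
                return Err(format!("empty field name in: {path}"));
            }
            segments.push(VariantPathSegment::Field(after_dot[..end].to_string()));
            rest = &after_dot[end..];
        } else if let Some(after_bracket) = rest.strip_prefix('[') {
            let close = after_bracket
                .find(']')
                .ok_or_else(|| format!("missing ']' in: {path}"))?;
            let inner = &after_bracket[..close];
            // Quoted content is a field name; anything else must be an index.
            let quoted = inner.strip_prefix('\'').and_then(|s| s.strip_suffix('\''));
            let segment = match quoted {
                Some(name) => VariantPathSegment::Field(name.to_string()),
                None => VariantPathSegment::Index(
                    inner
                        .parse::<usize>()
                        .map_err(|e| format!("bad index {inner:?}: {e}"))?,
                ),
            };
            segments.push(segment);
            rest = &after_bracket[close + 1..];
        } else {
            return Err(format!("unexpected input at: {rest}"));
        }
    }
    Ok(segments)
}
```

With this, `parse_path("$.a[0]['b c']")` yields `Field("a")`, `Index(0)`, `Field("b c")`, while anything outside the subset (wildcards, filters, and so on) is rejected. Callers that already have structured segments would skip the string form entirely.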
   
   > 2\. Should we also provide a "requested data type" field? Similar to the 
data bricks function
   
   IMO, yes. If we don't specify a type, how do we decide what type to extract 
the value as? In a variant, the type can change from row to row. Databricks 
chooses not to do any type inference and requires the user to specify a type 
explicitly for any typed variant access. When the type is not specified, as in 
`variant_col:someField`, Databricks returns another variant instead of trying 
to infer a type. This behavior makes sense to me.
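
To make that concrete, here is a minimal sketch of the semantics I have in mind: no requested type returns the value as a variant, and a requested type returns a typed value. The `Variant`, `RequestedType`, and `Extracted` types below are toy stand-ins I made up for illustration, not the real variant representation or a proposed signature. Returning an error on a type mismatch is just one option chosen for this sketch; returning null would be another.

```rust
/// Toy stand-in for a variant value (not the real encoding).
#[derive(Debug, Clone, PartialEq)]
enum Variant {
    Null,
    Int(i64),
    Str(String),
    List(Vec<Variant>),
    Object(Vec<(String, Variant)>),
}

/// Hypothetical path segment, as discussed for question 1.
#[derive(Debug, Clone, PartialEq)]
enum VariantPathSegment {
    Field(String),
    Index(usize),
}

/// Hypothetical "requested data type" argument.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RequestedType {
    Int64,
    Utf8,
}

/// What a single-row lookup produces in this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Extracted {
    /// No type requested: hand back a variant, no inference.
    Variant(Variant),
    Int64(i64),
    Utf8(String),
}

fn variant_get(
    root: &Variant,
    path: &[VariantPathSegment],
    as_type: Option<RequestedType>,
) -> Result<Option<Extracted>, String> {
    // Walk the path; a missing field or out-of-range index yields None (null).
    let mut current = root;
    for segment in path {
        let next = match (segment, current) {
            (VariantPathSegment::Field(name), Variant::Object(fields)) => {
                fields.iter().find(|(k, _)| k == name).map(|(_, v)| v)
            }
            (VariantPathSegment::Index(i), Variant::List(items)) => items.get(*i),
            _ => None,
        };
        match next {
            Some(v) => current = v,
            None => return Ok(None),
        }
    }

    let extracted = match (as_type, current) {
        (None, v) => Extracted::Variant(v.clone()),
        (Some(_), Variant::Null) => return Ok(None),
        (Some(RequestedType::Int64), Variant::Int(i)) => Extracted::Int64(*i),
        (Some(RequestedType::Utf8), Variant::Str(s)) => Extracted::Utf8(s.clone()),
        // Sketch-only choice: a mismatch is an error rather than a null.
        (Some(t), v) => return Err(format!("cannot read {v:?} as {t:?}")),
    };
    Ok(Some(extracted))
}
```

The point is that the output type is decided entirely by the caller's argument, never by the data in a given row, so a column-level kernel could pick its output array type up front.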


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org