[ 
https://issues.apache.org/jira/browse/SPARK-49443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-49443:
--------------------------------------

    Assignee: Apache Spark

> Implement to_variant_object expression and make schema_of_variant expressions 
> print OBJECT for Variant Objects
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-49443
>                 URL: https://issues.apache.org/jira/browse/SPARK-49443
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Harsh Motwani
>            Assignee: Apache Spark
>            Priority: Major
>              Labels: pull-request-available
>
> Casts from structs to variant objects should not be legal, since variant 
> objects are unordered bags of key-value pairs while structs are ordered 
> sequences of fields with fixed types. Casts between structs and variant 
> objects therefore do not behave like casts between structs: a struct-to-struct 
> cast matches fields by position, whereas a cast out of a variant object 
> matches them by name. Example (produced by Serge Rielau):
> {code:java}
> scala> spark.sql("SELECT cast(named_struct('c', 1, 'b', '2') as struct<b int, c int>)").show()
> +------------------------+
> |named_struct(c, 1, b, 2)|
> +------------------------+
> |                  {1, 2}|
> +------------------------+
> // Passing a struct into VARIANT loses the position
> scala> spark.sql("SELECT cast(named_struct('c', 1, 'b', '2')::variant as struct<b int, c int>)").show()
> +-----------------------------------------+
> |CAST(named_struct(c, 1, b, 2) AS VARIANT)|
> +-----------------------------------------+
> |                                   {2, 1}|
> +-----------------------------------------+
> {code}
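> The difference between the two results above can be modelled without Spark. 
> The following is a minimal sketch in plain Scala, not Spark's implementation: 
> the names StructVsVariantObject, castStruct, toVariantObject and castVariant 
> are hypothetical, and values are simplified to Int (the example above casts 
> the string '2').
> {code:scala}
> // Hypothetical model, not Spark code: a struct is an ordered list of
> // named fields, a variant object is an unordered map keyed by name.
> object StructVsVariantObject {
>   type Struct = Vector[(String, Int)]
>
>   // Struct-to-struct cast: values are matched by position and the
>   // field names are taken from the target type.
>   def castStruct(src: Struct, targetNames: Vector[String]): Struct =
>     targetNames.zip(src.map(_._2))
>
>   // Struct-to-variant: only the name -> value association survives.
>   def toVariantObject(src: Struct): Map[String, Int] = src.toMap
>
>   // Variant-to-struct cast: each target field is looked up by name.
>   def castVariant(obj: Map[String, Int],
>                   targetNames: Vector[String]): Struct =
>     targetNames.map(name => name -> obj(name))
>
>   def main(args: Array[String]): Unit = {
>     val src: Struct = Vector("c" -> 1, "b" -> 2)
>     val target = Vector("b", "c")
>     // Positional match: prints Vector(1, 2)
>     println(castStruct(src, target).map(_._2))
>     // Match by name: prints Vector(2, 1)
>     println(castVariant(toVariantObject(src), target).map(_._2))
>   }
> }
> {code}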
> Casts from maps to variant objects should also not be legal, since the two 
> are fundamentally different data types. A map can hold a variable number of 
> key-value pairs described by a single key type and a single value type in 
> its schema, whereas the schema of an object (as produced by 
> schema_of_variant expressions) has a separate type for each value in the 
> object. Objects can hold values of different types while maps cannot, and 
> objects can only have string keys while maps can also have complex keys.
> We should therefore prohibit the existing behavior of allowing explicit casts 
> from structs and maps to variants, as the variant spec currently supports 
> only an object type, which is just remotely compatible with structs and maps. 
> Instead, we should introduce a new expression that converts values whose 
> schemas contain structs and maps to variants. We will call it 
> `to_variant_object`.
> Also, schema_of_variant and schema_of_variant_agg expressions currently print 
> STRUCT when Variant Objects are observed. We should correct that to 
> OBJECT.
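> A possible usage sketch, assuming the expression is added under the name 
> proposed here and that schema_of_variant reports OBJECT as suggested. The 
> object name ToVariantObjectSketch and the local SparkSession settings are 
> illustrative assumptions, and the program needs a Spark build containing 
> these changes on the classpath.
> {code:scala}
> import org.apache.spark.sql.SparkSession
>
> // Hypothetical driver program; only the SQL text reflects the proposal.
> object ToVariantObjectSketch {
>   val queries: Seq[String] = Seq(
>     // Explicit conversion of a struct to a variant object.
>     "SELECT to_variant_object(named_struct('c', 1, 'b', '2')) AS v",
>     // Explicit conversion of a map with string keys.
>     "SELECT to_variant_object(map('k', 1)) AS v",
>     // Per this ticket, the schema should be reported as OBJECT<...>
>     // rather than STRUCT<...>.
>     "SELECT schema_of_variant(" +
>       "to_variant_object(named_struct('c', 1, 'b', '2'))) AS s"
>   )
>
>   def main(args: Array[String]): Unit = {
>     val spark = SparkSession.builder()
>       .master("local[1]")
>       .appName("to_variant_object sketch")
>       .getOrCreate()
>     try queries.foreach(q => spark.sql(q).show(false))
>     finally spark.stop()
>   }
> }
> {code}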



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
