Harsh Motwani created SPARK-49443:
-------------------------------------

             Summary: Implement to_variant_object expression and make 
schema_of_variant expressions print OBJECT for Variant Objects
                 Key: SPARK-49443
                 URL: https://issues.apache.org/jira/browse/SPARK-49443
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Harsh Motwani


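Casts from structs to variant objects should not be legal, since variant 
objects are unordered bags of key-value pairs while structs are ordered sets 
of elements of fixed types. A cast between two structs matches fields by 
position, but once a struct has become a variant object its field positions 
are gone and fields can only be matched by name. Therefore, casts between 
structs and variant objects do not behave like casts between structs. Example 
(produced by Serge Rielau):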


{code:java}
scala> spark.sql("SELECT cast(named_struct('c', 1, 'b', '2') as struct<b int, c int>)").show()
+------------------------+
|named_struct(c, 1, b, 2)|
+------------------------+
|                  {1, 2}|
+------------------------+

// Passing a struct into VARIANT loses the position
scala> spark.sql("SELECT cast(named_struct('c', 1, 'b', '2')::variant as struct<b int, c int>)").show()
+-----------------------------------------+
|CAST(named_struct(c, 1, b, 2) AS VARIANT)|
+-----------------------------------------+
|                                   {2, 1}|
+-----------------------------------------+
{code}
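
The difference between the two results above can be reproduced with a small toy model. This is only a sketch and not Spark's implementation: the object name `CastOrderSketch` and its helper functions are hypothetical, values are simplified to `Int`, and a variant object is modelled as a plain Scala `Map`.

```scala
// Toy model only: this is not how Spark implements casts.
object CastOrderSketch {
  // A struct is modelled as an ordered list of (fieldName, value) pairs.
  type Struct = Seq[(String, Int)]

  // Struct-to-struct cast: fields are matched by position, so the source
  // names are dropped and the target names are applied in order.
  def castByPosition(source: Struct, targetFields: Seq[String]): Struct =
    targetFields.zip(source.map(_._2))

  // Struct-to-variant-object conversion: only the key-to-value mapping
  // survives, the original field positions do not.
  def toVariantObject(source: Struct): Map[String, Int] = source.toMap

  // Variant-object-to-struct cast: with no positions left, each target
  // field has to be looked up by name (throws if a key is missing).
  def castByName(obj: Map[String, Int], targetFields: Seq[String]): Struct =
    targetFields.map(name => name -> obj(name))

  def main(args: Array[String]): Unit = {
    val source: Struct = Seq("c" -> 1, "b" -> 2)
    val target = Seq("b", "c")
    println(castByPosition(source, target))               // List((b,1), (c,2))
    println(castByName(toVariantObject(source), target))  // List((b,2), (c,1))
  }
}
```

The positional cast yields the values in the order `1, 2`, while the round trip through the modelled object yields `2, 1`, mirroring the two outputs in the example.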

Casts from maps to variant objects should also not be legal, since the two are 
fundamentally different data types. A map's schema declares just one key type 
and one value type, and the map can then hold a variable number of key-value 
pairs. For an object, the schema (as produced by schema_of_variant expressions) 
has a separate type for each value in the object. Objects can have values of 
different types while maps cannot, and objects can only have string keys while 
maps can also have complex keys.

We should therefore prohibit the existing behavior of allowing explicit casts 
from structs and maps to variants, as the variant spec currently only supports 
an object type, which is only remotely compatible with structs and maps. 
Instead, we should introduce a new expression, `to_variant_object`, that 
converts values whose schemas contain structs and maps to variants.

Also, schema_of_variant and schema_of_variant_agg expressions currently print 
STRUCT when Variant Objects are observed. We should also correct that to OBJECT.
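
A possible usage of the proposed expression, written as a sketch rather than a confirmed API: it assumes that `to_variant_object` is available under the name proposed in this issue, that it accepts struct and map arguments, and that a Spark build containing it is on the classpath. The object name `ToVariantObjectSketch` and the helper `schemaOf` are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object ToVariantObjectSketch {
  // Returns the schema string that schema_of_variant reports for the
  // variant produced from the given SQL expression.
  // Assumption: to_variant_object exists as proposed in this issue.
  def schemaOf(spark: SparkSession, expr: String): String =
    spark
      .sql(s"SELECT schema_of_variant(to_variant_object($expr)) AS s")
      .head()
      .getString(0)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("to_variant_object sketch")
      .getOrCreate()

    // Struct input: expected to become a variant object.
    println(schemaOf(spark, "named_struct('c', 1, 'b', '2')"))

    // Map input with string keys: also expected to become a variant object.
    println(schemaOf(spark, "map('k1', 1, 'k2', 2)"))

    spark.stop()
  }
}
```

If the change described here is in place, both printed schemas should start with OBJECT rather than STRUCT.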



