Bruce Robbins created SPARK-26707:
-------------------------------------

             Summary: Insert into table with single struct column fails
                 Key: SPARK-26707
                 URL: https://issues.apache.org/jira/browse/SPARK-26707
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.0, 2.3.2, 2.2.3, 3.0.0
            Reporter: Bruce Robbins


This works:
{noformat}
scala> sql("select named_struct('d1', 123) c1, 12 c2").write.format("parquet").saveAsTable("structtbl2")

scala> sql("show create table structtbl2").show(truncate=false)
+---------------------------------------------------------------------------+
|createtab_stmt                                                             |
+---------------------------------------------------------------------------+
|CREATE TABLE `structtbl2` (`c1` STRUCT<`d1`: INT>, `c2` INT)
USING parquet
|
+---------------------------------------------------------------------------+

scala> sql("insert into structtbl2 values (struct(789), 17)")
res2: org.apache.spark.sql.DataFrame = []

scala> sql("select * from structtbl2").show
+-----+---+
|   c1| c2|
+-----+---+
|[789]| 17|
|[123]| 12|
+-----+---+
scala>
{noformat}
However, if the table's only column is the struct column, the insert does not 
work:
{noformat}
scala> sql("select named_struct('d1', 123) c1").write.format("parquet").saveAsTable("structtbl1")

scala> sql("show create table structtbl1").show(truncate=false)
+-----------------------------------------------------------------+
|createtab_stmt                                                   |
+-----------------------------------------------------------------+
|CREATE TABLE `structtbl1` (`c1` STRUCT<`d1`: INT>)
USING parquet
|
+-----------------------------------------------------------------+

scala> sql("insert into structtbl1 values (struct(789))")
org.apache.spark.sql.AnalysisException: cannot resolve '`col1`' due to data type mismatch: cannot cast int to struct<d1:int>;;
'InsertIntoHadoopFsRelationCommand 
file:/Users/brobbins/github/spark_upstream/spark-warehouse/structtbl1, false, 
Parquet, Map(path -> 
file:/Users/brobbins/github/spark_upstream/spark-warehouse/structtbl1), Append, 
CatalogTable(
...etc...
{noformat}
I can work around it by using a named_struct expression as the value:
{noformat}
scala> sql("insert into structtbl1 values (named_struct('d1',789))")
res7: org.apache.spark.sql.DataFrame = []

scala> sql("select * from structtbl1").show
+-----+
|   c1|
+-----+
|[789]|
|[123]|
+-----+

scala>
{noformat}
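For completeness, here is a self-contained sketch of another possible workaround: appending through the DataFrame writer API instead of INSERT INTO ... VALUES, so the struct value never passes through VALUES-clause resolution. This is an assumption on my part, not a confirmed fix; the session setup and app name are illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Local session for reproducing; master/appName are illustrative choices.
val spark = SparkSession.builder
  .master("local[*]")
  .appName("SPARK-26707-repro")
  .getOrCreate()

spark.sql("drop table if exists structtbl1")

// Create the single-struct-column table exactly as in the report.
spark.sql("select named_struct('d1', 123) c1")
  .write.format("parquet").saveAsTable("structtbl1")

// Hypothetical workaround: append via the DataFrame writer rather than
// "insert into structtbl1 values (struct(789))", which fails to resolve.
spark.sql("select named_struct('d1', 789) c1")
  .write.mode("append").format("parquet").saveAsTable("structtbl1")
```

If the guess is right, the table should then hold both rows, just as with the named_struct workaround above.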
My guess is that I just don't understand how structs work, but the error message suggests the single struct value in the VALUES clause is being resolved as a plain int column (`col1`), so maybe this is a bug.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
