[ https://issues.apache.org/jira/browse/SPARK-26707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-26707:
------------------------------------

    Assignee: Apache Spark

> Insert into table with single struct column fails
> -------------------------------------------------
>
>                 Key: SPARK-26707
>                 URL: https://issues.apache.org/jira/browse/SPARK-26707
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.3, 2.3.2, 2.4.0, 3.0.0
>            Reporter: Bruce Robbins
>            Assignee: Apache Spark
>            Priority: Minor
>
> This works:
> {noformat}
> scala> sql("select named_struct('d1', 123) c1, 12 c2").write.format("parquet").saveAsTable("structtbl2")
>
> scala> sql("show create table structtbl2").show(truncate=false)
> +---------------------------------------------------------------------------+
> |createtab_stmt                                                             |
> +---------------------------------------------------------------------------+
> |CREATE TABLE `structtbl2` (`c1` STRUCT<`d1`: INT>, `c2` INT)
> USING parquet
> |
> +---------------------------------------------------------------------------+
>
> scala> sql("insert into structtbl2 values (struct(789), 17)")
> res2: org.apache.spark.sql.DataFrame = []
>
> scala> sql("select * from structtbl2").show
> +-----+---+
> |   c1| c2|
> +-----+---+
> |[789]| 17|
> |[123]| 12|
> +-----+---+
> {noformat}
> However, if the table's only column is the struct column, the insert does not work:
> {noformat}
> scala> sql("select named_struct('d1', 123) c1").write.format("parquet").saveAsTable("structtbl1")
>
> scala> sql("show create table structtbl1").show(truncate=false)
> +-----------------------------------------------------------------+
> |createtab_stmt                                                   |
> +-----------------------------------------------------------------+
> |CREATE TABLE `structtbl1` (`c1` STRUCT<`d1`: INT>)
> USING parquet
> |
> +-----------------------------------------------------------------+
>
> scala> sql("insert into structtbl1 values (struct(789))")
> org.apache.spark.sql.AnalysisException: cannot resolve '`col1`' due to data type mismatch: cannot cast int to struct<d1:int>;;
> 'InsertIntoHadoopFsRelationCommand file:/Users/brobbins/github/spark_upstream/spark-warehouse/structtbl1, false, Parquet, Map(path -> file:/Users/brobbins/github/spark_upstream/spark-warehouse/structtbl1), Append, CatalogTable(
> ...etc...
> {noformat}
> I can work around it by using a named_struct as the value:
> {noformat}
> scala> sql("insert into structtbl1 values (named_struct('d1',789))")
> res7: org.apache.spark.sql.DataFrame = []
>
> scala> sql("select * from structtbl1").show
> +-----+
> |   c1|
> +-----+
> |[789]|
> |[123]|
> +-----+
> {noformat}
> My guess is that I just don't understand how structs work. But maybe this is a bug.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
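For anyone reproducing this outside spark-shell, the transcripts above condense into the following standalone sketch. This is an illustration, not part of the original report: the `SparkSession` setup and app name are assumptions, and the observed behavior is only what the transcripts above report.

```scala
// Condensed repro of SPARK-26707 (sketch; assumes a local Spark build on the
// classpath, with an app name chosen here for illustration).
import org.apache.spark.sql.SparkSession

object Spark26707Repro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-26707-repro")
      .master("local[*]")
      .getOrCreate()
    import spark.sql

    // Table whose only column is a struct.
    sql("select named_struct('d1', 123) c1")
      .write.format("parquet").saveAsTable("structtbl1")

    // Per the report, this fails with:
    //   AnalysisException: cannot resolve '`col1`' due to data type mismatch:
    //   cannot cast int to struct<d1:int>
    // sql("insert into structtbl1 values (struct(789))")

    // Reported workaround: spell the value out with named_struct.
    sql("insert into structtbl1 values (named_struct('d1',789))")
    sql("select * from structtbl1").show()

    spark.stop()
  }
}
```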