Bruce Robbins created SPARK-26707:
-------------------------------------

             Summary: Insert into table with single struct column fails
                 Key: SPARK-26707
                 URL: https://issues.apache.org/jira/browse/SPARK-26707
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.0, 2.3.2, 2.2.3, 3.0.0
            Reporter: Bruce Robbins

This works:
{noformat}
scala> sql("select named_struct('d1', 123) c1, 12 c2").write.format("parquet").saveAsTable("structtbl2")

scala> sql("show create table structtbl2").show(truncate=false)
+---------------------------------------------------------------------------+
|createtab_stmt                                                             |
+---------------------------------------------------------------------------+
|CREATE TABLE `structtbl2` (`c1` STRUCT<`d1`: INT>, `c2` INT) USING parquet |
+---------------------------------------------------------------------------+

scala> sql("insert into structtbl2 values (struct(789), 17)")
res2: org.apache.spark.sql.DataFrame = []

scala> sql("select * from structtbl2").show
+-----+---+
|   c1| c2|
+-----+---+
|[789]| 17|
|[123]| 12|
+-----+---+

scala>
{noformat}

However, if the table's only column is the struct column, the insert does not work:
{noformat}
scala> sql("select named_struct('d1', 123) c1").write.format("parquet").saveAsTable("structtbl1")

scala> sql("show create table structtbl1").show(truncate=false)
+-----------------------------------------------------------------+
|createtab_stmt                                                   |
+-----------------------------------------------------------------+
|CREATE TABLE `structtbl1` (`c1` STRUCT<`d1`: INT>) USING parquet |
+-----------------------------------------------------------------+

scala> sql("insert into structtbl1 values (struct(789))")
org.apache.spark.sql.AnalysisException: cannot resolve '`col1`' due to data type mismatch: cannot cast int to struct<d1:int>;;
'InsertIntoHadoopFsRelationCommand file:/Users/brobbins/github/spark_upstream/spark-warehouse/structtbl1, false, Parquet, Map(path -> file:/Users/brobbins/github/spark_upstream/spark-warehouse/structtbl1), Append, CatalogTable(
...etc...
{noformat}

I can work around it by using a named_struct as the value:
{noformat}
scala> sql("insert into structtbl1 values (named_struct('d1',789))")
res7: org.apache.spark.sql.DataFrame = []

scala> sql("select * from structtbl1").show
+-----+
|   c1|
+-----+
|[789]|
|[123]|
+-----+

scala>
{noformat}

My guess is that I just don't understand how structs work. But maybe this is a bug.