[ https://issues.apache.org/jira/browse/SPARK-41538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-41538:
------------------------------------

    Assignee: Gengliang Wang  (was: Apache Spark)

> Metadata column should be appended at the end of project list
> -------------------------------------------------------------
>
>                 Key: SPARK-41538
>                 URL: https://issues.apache.org/jira/browse/SPARK-41538
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 3.3.2, 3.4.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>
> For the following query:
>
> {code:java}
> CREATE TABLE table_1 (
>   a ARRAY<STRING>,
>   s STRUCT<id: STRING>)
> USING parquet;
>
> CREATE VIEW view_1 (id)
> AS WITH source AS (
>     SELECT * FROM table_1
>   ),
>   renamed AS (
>     SELECT
>       s.id
>     FROM source
>   )
> SELECT id FROM renamed;
>
> with foo AS (
>   SELECT 'a' as id
> ),
> bar AS (
>   SELECT 'a' as id
> )
> SELECT
>   1
> FROM foo
> FULL OUTER JOIN bar USING(id)
> FULL OUTER JOIN view_1 USING(id)
> WHERE foo.id IS NOT NULL{code}
> There will be the following error:
>
> {code:java}
> class org.apache.spark.sql.types.ArrayType cannot be cast to class org.apache.spark.sql.types.StructType (org.apache.spark.sql.types.ArrayType and org.apache.spark.sql.types.StructType are in unnamed module of loader 'app')
> java.lang.ClassCastException: class org.apache.spark.sql.types.ArrayType cannot be cast to class org.apache.spark.sql.types.StructType (org.apache.spark.sql.types.ArrayType and org.apache.spark.sql.types.StructType are in unnamed module of loader 'app')
>     at org.apache.spark.sql.catalyst.expressions.GetStructField.childSchema$lzycompute(complexTypeExtractors.scala:108)
>     at org.apache.spark.sql.catalyst.expressions.GetStructField.childSchema(complexTypeExtractors.scala:108)
>     at org.apache.spark.sql.catalyst.expressions.GetStructField.dataType(complexTypeExtractors.scala:114)
>     at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:193)
>     at org.apache.spark.sql.catalyst.expressions.AliasHelper$$anonfun$getAliasMap$1.applyOrElse(AliasHelper.scala:50)
>     at org.apache.spark.sql.catalyst.expressions.AliasHelper$$anonfun$getAliasMap$1.applyOrElse(AliasHelper.scala:50)
>     at scala.collection.immutable.List.collect(List.scala:315)
>     at org.apache.spark.sql.catalyst.expressions.AliasHelper.getAliasMap(AliasHelper.scala:50)
>     at org.apache.spark.sql.catalyst.expressions.AliasHelper.getAliasMap$(AliasHelper.scala:47)
>     at org.apache.spark.sql.catalyst.optimizer.CollapseProject$.getAliasMap(Optimizer.scala:992)
>     at org.apache.spark.sql.catalyst.optimizer.CollapseProject$.canCollapseExpressions(Optimizer.scala:1029){code}
> This is caused by the inconsistent metadata column positions in the following
> two nodes:
> * Table relation: at the ending position
> * Project list: at the beginning position
>
> When the InlineCTE rule executes, the metadata column in project is wrongly
> combined with the table output.
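The positional mismatch described in the issue can be illustrated with a toy model. This is only a sketch under stated assumptions, not Spark or Catalyst code: plain Python lists stand in for the output of the table relation and for the project list, and the column name `meta` and the helper `resolve_by_position` are hypothetical names made up for the illustration. It shows how a lookup that takes a column's position from one ordering and reads another ordering at that position can land on the array column when the struct column was intended, which is consistent with the ClassCastException in the report.

```python
# Toy model only: plain Python lists stand in for plan-node outputs.
# None of these names are Spark APIs.

# Output of the table relation: metadata column at the end.
relation_output = [("a", "array"), ("s", "struct"), ("meta", "metadata")]

# Project list over the same columns: metadata column at the beginning.
project_list = [("meta", "metadata"), ("a", "array"), ("s", "struct")]


def resolve_by_position(name, index_source, value_source):
    """Find `name` in index_source, then read the same slot of value_source."""
    position = [col for col, _ in index_source].index(name)
    return value_source[position]


# The two orderings disagree, so the positional lookup for the struct
# column "s" lands on the array column "a" instead.
mismatched = resolve_by_position("s", relation_output, project_list)

# With the metadata column appended at the end of the project list, the
# two orderings agree and the same lookup returns the struct column.
aligned_project_list = [("a", "array"), ("s", "struct"), ("meta", "metadata")]
aligned = resolve_by_position("s", relation_output, aligned_project_list)

print(mismatched)  # ('a', 'array')
print(aligned)     # ('s', 'struct')
```

In this model, keeping the metadata column at the same relative position in both lists (at the end, as the issue title proposes) is enough to make the positional pairing line up again.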