Gengliang Wang created SPARK-41538:
--------------------------------------

             Summary: Metadata column should be appended at the end of project list
                 Key: SPARK-41538
                 URL: https://issues.apache.org/jira/browse/SPARK-41538
             Project: Spark
          Issue Type: Task
          Components: SQL
    Affects Versions: 3.3.2, 3.4.0
            Reporter: Gengliang Wang
            Assignee: Gengliang Wang

For the following query:
{code:java}
CREATE TABLE table_1 (
  a ARRAY<STRING>,
  s STRUCT<id: STRING>
) USING parquet;

CREATE VIEW view_1 (id) AS
WITH source AS (
  SELECT * FROM table_1
),
renamed AS (
  SELECT s.id FROM source
)
SELECT id FROM renamed;

WITH foo AS (
  SELECT 'a' AS id
),
bar AS (
  SELECT 'a' AS id
)
SELECT 1
FROM foo
FULL OUTER JOIN bar USING(id)
FULL OUTER JOIN view_1 USING(id)
WHERE foo.id IS NOT NULL
{code}
The query fails with the following error:
{code:java}
class org.apache.spark.sql.types.ArrayType cannot be cast to class org.apache.spark.sql.types.StructType (org.apache.spark.sql.types.ArrayType and org.apache.spark.sql.types.StructType are in unnamed module of loader 'app')
java.lang.ClassCastException: class org.apache.spark.sql.types.ArrayType cannot be cast to class org.apache.spark.sql.types.StructType (org.apache.spark.sql.types.ArrayType and org.apache.spark.sql.types.StructType are in unnamed module of loader 'app')
	at org.apache.spark.sql.catalyst.expressions.GetStructField.childSchema$lzycompute(complexTypeExtractors.scala:108)
	at org.apache.spark.sql.catalyst.expressions.GetStructField.childSchema(complexTypeExtractors.scala:108)
	at org.apache.spark.sql.catalyst.expressions.GetStructField.dataType(complexTypeExtractors.scala:114)
	at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:193)
	at org.apache.spark.sql.catalyst.expressions.AliasHelper$$anonfun$getAliasMap$1.applyOrElse(AliasHelper.scala:50)
	at org.apache.spark.sql.catalyst.expressions.AliasHelper$$anonfun$getAliasMap$1.applyOrElse(AliasHelper.scala:50)
	at scala.collection.immutable.List.collect(List.scala:315)
	at org.apache.spark.sql.catalyst.expressions.AliasHelper.getAliasMap(AliasHelper.scala:50)
	at org.apache.spark.sql.catalyst.expressions.AliasHelper.getAliasMap$(AliasHelper.scala:47)
	at org.apache.spark.sql.catalyst.optimizer.CollapseProject$.getAliasMap(Optimizer.scala:992)
	at org.apache.spark.sql.catalyst.optimizer.CollapseProject$.canCollapseExpressions(Optimizer.scala:1029)
{code}
This is caused by the inconsistent metadata column positions in the following two nodes:
* Table relation: at the end of the output
* Project list: at the beginning of the list

When the InlineCTE rule executes, the metadata column in the project list is wrongly combined with the table output.
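
To illustrate why inconsistent column positions can surface as a type error, here is a toy model in plain Python. It is not Spark code and does not reproduce the InlineCTE or CollapseProject logic; the names Column, get_struct_field, resolve_by_position and the "hidden" column are made up for this sketch. It only assumes that two nodes list the same columns in different orders and that a lookup pairs them up by position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str  # simplified type tag, e.g. "struct" or "array"


def get_struct_field(column: Column, field: str) -> str:
    """Toy struct-field extractor: it only accepts a struct-typed child."""
    if column.data_type != "struct":
        raise TypeError(f"{column.data_type} cannot be cast to struct")
    return f"{column.name}.{field}"


def resolve_by_position(name, producer_output, consumer_view):
    """Find `name` in the consumer's view, then read that slot from the producer."""
    ordinal = [c.name for c in consumer_view].index(name)
    return producer_output[ordinal]


s = Column("s", "struct")
hidden = Column("hidden", "array")  # hypothetical stand-in for the metadata column

# Inconsistent layouts: one node lists the hidden column last, the other first.
relation_output = [s, hidden]
project_list = [hidden, s]

mismatched = resolve_by_position("s", relation_output, project_list)
try:
    get_struct_field(mismatched, "id")
    error = None
except TypeError as exc:
    error = str(exc)

# Consistent layouts: both nodes list the hidden column last.
aligned = resolve_by_position("s", relation_output, [s, hidden])
result = get_struct_field(aligned, "id")
```

With the mismatched layouts the lookup for "s" lands on the array-typed column and the extractor raises; once both nodes put the extra column at the end, the same lookup returns the struct column. This mirrors the direction of the fix proposed in the summary, under the simplifying assumptions above.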