[jira] [Assigned] (SPARK-41538) Metadata column should be appended at the end of project list

Apache Spark (Jira) Thu, 15 Dec 2022 16:13:08 -0800


     [ https://issues.apache.org/jira/browse/SPARK-41538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Apache Spark reassigned SPARK-41538:
------------------------------------

    Assignee: Gengliang Wang  (was: Apache Spark)

> Metadata column should be appended at the end of project list
> -------------------------------------------------------------
>
>                 Key: SPARK-41538
>                 URL: https://issues.apache.org/jira/browse/SPARK-41538
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 3.3.2, 3.4.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>
> For the following query:
>  
> {code:java}
> CREATE TABLE table_1 (
>   a ARRAY<STRING>,
>   s STRUCT<id: STRING>)
> USING parquet;
>
> CREATE VIEW view_1 (id)
> AS WITH source AS (
>   SELECT * FROM table_1
> ),
> renamed AS (
>   SELECT s.id
>   FROM source
> )
> SELECT id FROM renamed;
>
> WITH foo AS (
>   SELECT 'a' AS id
> ),
> bar AS (
>   SELECT 'a' AS id
> )
> SELECT
>   1
> FROM foo
> FULL OUTER JOIN bar USING(id)
> FULL OUTER JOIN view_1 USING(id)
> WHERE foo.id IS NOT NULL{code}
> The last query fails with the following error:
>  
> {code:java}
> class org.apache.spark.sql.types.ArrayType cannot be cast to class org.apache.spark.sql.types.StructType (org.apache.spark.sql.types.ArrayType and org.apache.spark.sql.types.StructType are in unnamed module of loader 'app')
> java.lang.ClassCastException: class org.apache.spark.sql.types.ArrayType cannot be cast to class org.apache.spark.sql.types.StructType (org.apache.spark.sql.types.ArrayType and org.apache.spark.sql.types.StructType are in unnamed module of loader 'app')
>     at org.apache.spark.sql.catalyst.expressions.GetStructField.childSchema$lzycompute(complexTypeExtractors.scala:108)
>     at org.apache.spark.sql.catalyst.expressions.GetStructField.childSchema(complexTypeExtractors.scala:108)
>     at org.apache.spark.sql.catalyst.expressions.GetStructField.dataType(complexTypeExtractors.scala:114)
>     at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:193)
>     at org.apache.spark.sql.catalyst.expressions.AliasHelper$$anonfun$getAliasMap$1.applyOrElse(AliasHelper.scala:50)
>     at org.apache.spark.sql.catalyst.expressions.AliasHelper$$anonfun$getAliasMap$1.applyOrElse(AliasHelper.scala:50)
>     at scala.collection.immutable.List.collect(List.scala:315)
>     at org.apache.spark.sql.catalyst.expressions.AliasHelper.getAliasMap(AliasHelper.scala:50)
>     at org.apache.spark.sql.catalyst.expressions.AliasHelper.getAliasMap$(AliasHelper.scala:47)
>     at org.apache.spark.sql.catalyst.optimizer.CollapseProject$.getAliasMap(Optimizer.scala:992)
>     at org.apache.spark.sql.catalyst.optimizer.CollapseProject$.canCollapseExpressions(Optimizer.scala:1029){code}
> The error is caused by metadata columns being placed at inconsistent 
> positions in the following two plan nodes:
>  * Table relation: at the end of the output
>  * Project list: at the beginning of the list
> When the InlineCTE rule runs, the metadata column in the project list is 
> incorrectly combined with the table output.
>  
>  
>  
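The positional mismatch described above can be illustrated with a toy model. This is only a sketch in plain Python, not Catalyst code: the lists stand in for attribute lists, the metadata column name `_meta` and the helper `read_by_ordinal` are hypothetical, and the assumption is simply that a column resolved against one layout is read by position from the other.

```python
# Toy model only: plain lists stand in for plan-node attribute lists.
# "_meta" is a hypothetical metadata column name, not a real Spark one.
TYPES = {"a": "array<string>", "s": "struct<id:string>", "_meta": "string"}

relation_output = ["a", "s", "_meta"]  # table relation: metadata at the end
project_front = ["_meta", "a", "s"]    # project list: metadata at the beginning
project_end = ["a", "s", "_meta"]      # project list: metadata appended at the end


def read_by_ordinal(name, resolved_against, actual_layout):
    """Resolve `name` to a position in one layout, then read that
    position from another layout (hypothetical helper)."""
    ordinal = resolved_against.index(name)
    return actual_layout[ordinal]


# With the metadata column at the front, the struct column `s` lands on `a`.
wrong = read_by_ordinal("s", relation_output, project_front)
print(wrong, TYPES[wrong])

# With the metadata column at the end, both layouts agree.
right = read_by_ordinal("s", relation_output, project_end)
print(right, TYPES[right])
```

In this model, a struct field access on `s` would be applied to an array column in the first case, which is the same kind of type confusion as the `ArrayType` to `StructType` cast failure in the stack trace. Appending the metadata column at the end of the project list keeps the two layouts aligned.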



