[ https://issues.apache.org/jira/browse/SPARK-38530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-38530.
---------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed

Issue resolved by pull request 35866
[https://github.com/apache/spark/pull/35866]

> GeneratorNestedColumnAliasing does not work correctly for some expressions
> --------------------------------------------------------------------------
>
>                 Key: SPARK-38530
>                 URL: https://issues.apache.org/jira/browse/SPARK-38530
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 3.2.1
>            Reporter: Min Yang
>            Priority: Major
>             Fix For: 3.3.0
>
>
> [https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala#L226]
> The code that collects ExtractValue expressions is wrong. It should walk the 
> expression bottom-up instead of checking only two levels. The current check 
> can produce incorrect results when the expression looks like 
> ExtractValue(ExtractValue(some_other_expr)).
>  
> An example to trigger the bug is:
>  
> input: <col1: array<struct<a: int, b: struct<a: struct<a: int, b: int>, b: int>>>>
>  
> Project(ExtractValue(ExtractValue(CaseWhen([col.a == 1, col.b]), "a"), "a"))
> - Generate(Explode(col1))
>  
> The optimizer incorrectly tries to push the whole expression down into the 
> input of the Explode. The CaseWhen then receives array<...> as its input 
> instead of a single struct, so the query returns a wrong result.
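
The difference between a two-level check and a bottom-up walk can be sketched with a toy expression tree. This is not the Spark code: `Attr`, `Extract`, `Other`, and both `is_chain_*` functions are hypothetical stand-ins, and the two-level check is only a guess at the shape of the bug described above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Attr:
    """Toy column reference, e.g. the generator output `col`."""
    name: str


@dataclass(frozen=True)
class Extract:
    """Toy stand-in for ExtractValue: selects `field` from `child`."""
    child: "Expr"
    field: str


@dataclass(frozen=True)
class Other:
    """Any non-extraction expression, e.g. CaseWhen."""
    label: str
    children: tuple = ()


Expr = Union[Attr, Extract, Other]


def is_chain_two_level(e: Expr) -> bool:
    # Assumed shape of the bug: only the node and its direct child are
    # inspected, so whatever sits below a nested Extract goes unchecked.
    return isinstance(e, Extract) and isinstance(e.child, (Attr, Extract))


def is_chain_bottom_up(e: Expr) -> bool:
    # Accept only a pure chain: Extract nodes all the way down to an Attr.
    if not isinstance(e, Extract):
        return False
    while isinstance(e, Extract):
        e = e.child
    return isinstance(e, Attr)


col = Attr("col")
case_when = Other("CaseWhen", (Extract(col, "a"), Extract(col, "b")))

# ExtractValue(ExtractValue(CaseWhen(...), "a"), "a") from the report.
bad = Extract(Extract(case_when, "a"), "a")
# A plain nested field access such as col.b.a.
good = Extract(Extract(col, "b"), "a")

print(is_chain_two_level(bad), is_chain_bottom_up(bad))
print(is_chain_two_level(good), is_chain_bottom_up(good))
```

In this sketch the two-level check accepts `bad` because it never looks below the inner `Extract`, while the bottom-up walk rejects it once it reaches the `CaseWhen` node. Both checks accept `good`.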



