[ https://issues.apache.org/jira/browse/SPARK-38530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan reassigned SPARK-38530:
-----------------------------------

    Assignee: Min Yang

> GeneratorNestedColumnAliasing does not work correctly for some expressions
> --------------------------------------------------------------------------
>
>                 Key: SPARK-38530
>                 URL: https://issues.apache.org/jira/browse/SPARK-38530
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 3.2.1
>            Reporter: Min Yang
>            Assignee: Min Yang
>            Priority: Major
>             Fix For: 3.3.0
>
>
> [https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala#L226]
> The code that collects ExtractValue expressions is wrong. It should walk the
> expression bottom-up instead of checking only 2 levels. The current check can
> produce incorrect results if the expression looks like
> ExtractValue(ExtractValue(some_other_expr)).
>
> An example that triggers the bug:
>
> input: <col1: array<struct<a: int, b: struct<a: struct<a: int, b: int>, b: int>>>>
>
> Project(ExtractValue(ExtractValue(CaseWhen([col.a == 1, col.b]), "a"), "a"))
> - Generate(Explode(col1))
>
> The optimizer incorrectly tries to push the whole expression down into the
> input of the Explode. The CaseWhen then receives array<...> as its input
> instead of a single struct, so the query returns a wrong result.
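
The difference between the 2-level check and a bottom-up check can be sketched with a toy expression tree. This is only an illustration under stated assumptions: `Attr`, `ExtractValue`, and `CaseWhen` below are hypothetical stand-ins written for this example, not Spark's Catalyst classes, and the two functions are not the code in NestedColumnAliasing.scala. The `col.a == 1` condition is simplified to a plain `col.a` operand.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Attr:
    """Hypothetical stand-in for an attribute such as the generator output `col`."""
    name: str


@dataclass(frozen=True)
class ExtractValue:
    """Hypothetical stand-in for a field extraction: child.field."""
    child: object
    field: str


@dataclass(frozen=True)
class CaseWhen:
    """Hypothetical stand-in for CASE WHEN; operands are kept as a flat tuple."""
    operands: Tuple[object, ...]


def references(expr) -> Set[str]:
    """Collect the names of all attributes used anywhere in the expression."""
    if isinstance(expr, Attr):
        return {expr.name}
    if isinstance(expr, ExtractValue):
        return references(expr.child)
    if isinstance(expr, CaseWhen):
        names: Set[str] = set()
        for operand in expr.operands:
            names |= references(operand)
        return names
    raise TypeError(f"unsupported expression: {expr!r}")


def two_level_is_pushable(expr, attr_name: str) -> bool:
    """Assumed shape of the flawed check: it inspects only the top two levels."""
    if not isinstance(expr, ExtractValue):
        return False
    child = expr.child
    if isinstance(child, ExtractValue):
        # Accepted without looking at what the inner ExtractValue wraps.
        return references(expr) == {attr_name}
    return isinstance(child, Attr) and child.name == attr_name


def bottom_up_is_pushable(expr, attr_name: str) -> bool:
    """Accept only a pure ExtractValue chain whose leaf is the attribute itself."""
    if isinstance(expr, Attr):
        return expr.name == attr_name
    if isinstance(expr, ExtractValue):
        return bottom_up_is_pushable(expr.child, attr_name)
    return False  # any other node (e.g. CaseWhen) breaks the chain


col = Attr("col")

# ExtractValue(ExtractValue(CaseWhen([col.a, col.b]), "a"), "a")
buggy = ExtractValue(
    ExtractValue(CaseWhen((ExtractValue(col, "a"), ExtractValue(col, "b"))), "a"),
    "a",
)

# col.b.a, a plain chain that is safe to push below the Explode
safe = ExtractValue(ExtractValue(col, "b"), "a")

print(two_level_is_pushable(buggy, "col"), bottom_up_is_pushable(buggy, "col"))
print(two_level_is_pushable(safe, "col"), bottom_up_is_pushable(safe, "col"))
```

In this toy model the 2-level check accepts the CaseWhen-based expression because its top two nodes are ExtractValue and it references only `col`, while the bottom-up check rejects it because the chain does not end at the attribute. Both checks accept the plain `col.b.a` chain.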