[ https://issues.apache.org/jira/browse/SPARK-47217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Asif updated SPARK-47217:
-------------------------
    Description:
In some flavours of nested joins in which a relation is repeated, projected columns passed to the DataFrame.select API in the form df.column can cause plan resolution to fail, because the attribute cannot be resolved.

A scenario in which this happens is:
{noformat}
Project ( dataframe A.column("col-a") )
          |
        Join2
       |     |
    Join1   DataFrame A
      |
DataFrame A   DataFrame B
{noformat}
In such cases, if the DataFrame A on the right leg of Join2 gets re-aliased by de-duplication of relations, and the Project uses a Column definition obtained from DataFrame A, the column's exprId will not match the re-aliased right leg of Join2, causing resolution failure.

  was:
In some flavours of self-join queries or nested joins in which a relation is repeated, projected columns passed to the DataFrame.select API in the form df.column can cause plan resolution to fail, because the attribute cannot be resolved.

A scenario in which this happens is:
{noformat}
Project ( dataframe A.column("col-a") )
          |
        Join2
       |     |
    Join1   DataFrame A
      |
DataFrame A   DataFrame B
{noformat}
In such cases, if the DataFrame A on the right leg of Join2 gets re-aliased by de-duplication of relations, and the Project uses a Column definition obtained from DataFrame A, the column's exprId will not match the re-aliased right leg of Join2, causing resolution failure.
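The exprId mismatch described above can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's analyzer code: `Attribute`, `relation`, `join`, and `matches` are hypothetical names invented for this illustration, and the de-duplication rule (re-alias every right-side attribute whose id already occurs on the left) is an assumed simplification.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # toy stand-in for an expression-id generator


@dataclass(frozen=True)
class Attribute:
    name: str
    expr_id: int


def relation(*names):
    """Create a relation's output attributes, each with a fresh id."""
    return [Attribute(n, next(_ids)) for n in names]


def join(left, right):
    """Join two outputs. Right-side attributes whose ids already occur on
    the left are re-aliased with fresh ids (toy de-duplication)."""
    left_ids = {a.expr_id for a in left}
    new_right = [
        Attribute(a.name, next(_ids)) if a.expr_id in left_ids else a
        for a in right
    ]
    return left + new_right, new_right


def matches(column, output):
    """True if some attribute in `output` carries the column's id."""
    return any(a.expr_id == column.expr_id for a in output)


df_a = relation("col-a")
df_b = relation("col-b")
col_a = df_a[0]  # the Column captured from DataFrame A before joining

join1, _ = join(df_a, df_b)            # no repeated relation, nothing re-aliased
join2, right_leg = join(join1, df_a)   # DataFrame A repeats, right leg re-aliased

# The captured column keeps its old id, so it no longer matches the right leg.
right_leg_matches = matches(col_a, right_leg)
```

The model only shows that the captured column's id differs from the re-aliased right leg of Join2. It does not attempt to reproduce the analyzer's full resolution path or the exact conditions under which the real plan fails.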
> De-duplication of Relations in Joins, can result in plan resolution failure
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-47217
>                 URL: https://issues.apache.org/jira/browse/SPARK-47217
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.5.1
>            Reporter: Asif
>            Priority: Major
>              Labels: Spark-SQL