[ 
https://issues.apache.org/jira/browse/SPARK-47217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Asif updated SPARK-47217:
-------------------------
    Description: 
In some nested self-join queries, passing projected columns to the 
DataFrame.select API in the form df.column can cause plan resolution to fail 
because the attribute cannot be resolved.

One scenario in which this happens is:
                         
{noformat}
               
                 Project ( DataFrame A.column("col-a") )
                                  |
                                Join2
                       /                    \
                  Join1                 DataFrame A
                /       \
      DataFrame A     DataFrame B

{noformat}


In such cases, if the right leg of Join2 (DataFrame A) is re-aliased during 
de-duplication of relations, and the Project uses a Column obtained from 
DataFrame A, the column's exprId will not match the re-aliased right leg of 
Join2, and resolution fails.
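
The exprId mismatch described above can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not Spark's analyzer: Attr, relation, dedup_right and resolves_in are hypothetical names made up for this illustration, and real de-duplication in Spark is considerably more involved.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # stand-in for a global expression-id generator (assumption)


@dataclass(frozen=True)
class Attr:
    """Toy attribute: a column name plus a unique expression id."""
    name: str
    expr_id: int


def relation(*names):
    """Create a relation's output attributes, each with a fresh id."""
    return [Attr(n, next(_ids)) for n in names]


def dedup_right(left_out, right_out):
    """Re-alias right-side attributes whose ids already appear on the left."""
    left_ids = {a.expr_id for a in left_out}
    return [Attr(a.name, next(_ids)) if a.expr_id in left_ids else a
            for a in right_out]


def resolves_in(ref, output):
    """A reference binds to an output only if the same id is present there."""
    return any(a.expr_id == ref.expr_id for a in output)


df_a = relation("col-a")
df_b = relation("col-b")

join1_out = df_a + df_b                   # Join1: A joined with B
right_leg = dedup_right(join1_out, df_a)  # Join2's right leg: A again, re-aliased
col_ref = df_a[0]                         # what a column taken from DataFrame A carries

print(resolves_in(col_ref, join1_out))  # original id is still present under Join1
print(resolves_in(col_ref, right_leg))  # original id is gone from the re-aliased leg
```

In this model the reference taken from DataFrame A keeps its original id, while the right leg of Join2 now exposes "col-a" under a new id, so a lookup by id against that leg finds nothing.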

  was:
In some nested self-join queries, passing projected columns to the 
DataFrame.select API in the form df.column can cause plan resolution to fail 
because the attribute cannot be resolved.

One scenario in which this happens is:
                                        
                               Project ( dataframe A.column("col-a") )
                                         |
                                      Join2
                          |                            DataFrame A      
                       Join1
                          |
DataFrame A                DataFrame B


In such cases, if the right leg of Join2 (DataFrame A) is re-aliased during 
de-duplication of relations, and the Project uses a Column obtained from 
DataFrame A, the column's exprId will not match the re-aliased right leg of 
Join2, and resolution fails.


> De-duplication of Relations in Joins, can result in plan resolution failure
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-47217
>                 URL: https://issues.apache.org/jira/browse/SPARK-47217
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.5.1
>            Reporter: Asif
>            Priority: Major
>              Labels: Spark-SQL


