[ 
https://issues.apache.org/jira/browse/SPARK-33837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Su Qilong updated SPARK-33837:
------------------------------
    Description: 
For the following statements:
{code:java}
create table t1(id int, name string);
create table id(key int, name string);

select * from id where name in (select name from t1 where t1.id = id.key)
{code}
Spark supports this kind of query, but because table t1 also has an attribute 
named id, Spark raises an error like:
{noformat}
Can't extract value from id#83: need struct type but got int;
{noformat}
According to the implementation of Spark's ResolveReferences rule, the name 
resolution precedence is `db.table` => `table.attr` => `attr.innerfield`.

So here id.key should resolve to the attribute key of table id, rather than to 
an inner field of attribute id of table t1.
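
The precedence above can be illustrated with a small sketch. This is a toy 
model in Python, not Spark code: the function resolve_two_part, the dict-based 
scope, and the use of a dict to stand in for a struct type are all assumptions 
made for illustration. It only shows that, when both tables are visible in one 
scope, trying `table.attr` before `attr.innerfield` picks table id's column.

```python
# Toy model, NOT Spark code: resolve a two-part name within a single scope.
# A scope maps table name -> {column name -> type}; a dict type stands in
# for a struct column (hypothetical representation).

def resolve_two_part(name, scope):
    first, second = name.split(".")
    # 1) table.attr: first part names a table, second part one of its columns.
    if first in scope and second in scope[first]:
        return ("column", first, second)
    # 2) attr.innerfield: first part names a column of some table.
    for table, columns in scope.items():
        if first in columns:
            col_type = columns[first]
            if not isinstance(col_type, dict):
                # Message modelled loosely on the error quoted in this issue.
                raise TypeError(
                    f"Can't extract value from {first}: "
                    f"need struct type but got {col_type}"
                )
            if second in col_type:
                return ("field", table, first, second)
    return None

# Both tables from the example, visible in one scope.
scope = {
    "t1": {"id": "int", "name": "string"},
    "id": {"key": "int", "name": "string"},
}
print(resolve_two_part("id.key", scope))  # ('column', 'id', 'key')
```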

The bug lies in subquery resolution. The ResolveSubquery rule first tries to 
resolve the subquery independently, and only when unresolved attributes remain 
in the subquery does it try resolveOuterReferences. Here id.key already matches 
attribute id of t1 in the independent pass, so the outer table id is never 
considered.
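
The two-pass behaviour described above can be sketched as follows. This is 
again a toy Python model under stated assumptions, not Spark's implementation: 
INNER, OUTER, resolve_in, inner_first and table_match_first are hypothetical 
names, and every column is assumed to be a non-struct type. inner_first mimics 
"resolve the subquery alone, then fall back to the outer query", while 
table_match_first is one possible ordering that would give the result expected 
in this report.

```python
# Toy model of the two-pass resolution described in this issue; NOT Spark code.
INNER = {"t1": {"id": "int", "name": "string"}}   # tables of the subquery
OUTER = {"id": {"key": "int", "name": "string"}}  # tables of the outer query


class StructTypeError(Exception):
    """Stands in for the analysis error quoted in the report."""


def resolve_in(name, scope):
    """Resolve 'a.b' in one scope: table.attr first, then attr.innerfield."""
    first, second = name.split(".")
    if second in scope.get(first, {}):
        return f"{first}.{second}"
    for table, cols in scope.items():
        if first in cols:
            # 'first' matched a column, so 'second' is treated as a struct
            # field; no column here is a struct, so this always fails.
            raise StructTypeError(
                f"Can't extract value from {table}.{first}: "
                f"need struct type but got {cols[first]}"
            )
    return None


def inner_first(name):
    """Pass 1: subquery scope only. Pass 2: outer scope, if still unresolved."""
    resolved = resolve_in(name, INNER)
    return resolved if resolved is not None else resolve_in(name, OUTER)


def table_match_first(name):
    """Hypothetical ordering: try table.attr in both scopes before anything else."""
    first, second = name.split(".")
    for scope in (INNER, OUTER):
        if second in scope.get(first, {}):
            return f"{first}.{second}"
    return inner_first(name)


try:
    inner_first("id.key")
except StructTypeError as err:
    print("inner_first failed:", err)

print(table_match_first("id.key"))  # id.key
```

In this model, inner_first never consults OUTER for id.key because the first 
pass already binds id to t1's column and fails there.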

 

> Correlated subquery field resolve bug when inner table has a field with the 
> same name with outer table
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-33837
>                 URL: https://issues.apache.org/jira/browse/SPARK-33837
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.3
>            Reporter: Su Qilong
>            Priority: Major


