Allison Wang created SPARK-36117:
------------------------------------

             Summary: Join can become unresolved after PullupCorrelatedPredicates
                 Key: SPARK-36117
                 URL: https://issues.apache.org/jira/browse/SPARK-36117
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: Allison Wang


A join inside a correlated scalar subquery can become unresolved after PullupCorrelatedPredicates. To reproduce:
{code:sql}
create view t1(c1, c2) as values (0, 1), (1, 2);
create view t2(c1, c2) as values (0, 2), (0, 3);

select (
  select sum(l.cnt + r.cnt)
  from (select count(*) cnt from t2 where t1.c1 = t2.c1 having cnt = 0) l
  join (select count(*) cnt from t2 where t1.c1 = t2.c1 having cnt = 0) r
  on l.cnt = r.cnt
) from t1

== Optimized Logical Plan ==
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input '(' expecting {<EOF>, '.', '-'}(line 1, pos 14)

== SQL ==
scalarsubquery(c1, c1)
--------------^^^
{code}

This happens because duplicate attributes are not handled correctly when 
correlated predicates are pulled up over a join. Both 
`pullOutCorrelatedPredicates` and `DecorrelateInnerQuery` are affected.
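
To make the failure mode concrete, here is a minimal toy model in Python. It is not Catalyst code and not the eventual fix. It assumes that, after the pull-up, both join sides end up exposing an attribute with the same expression ID, and that a join only counts as resolved when its two sides share no attribute IDs (similar in spirit to the `duplicateResolved` check on `Join`). The names `Attr`, `duplicate_resolved` and `dedup_right`, and the IDs used, are hypothetical.

```python
from dataclasses import dataclass
from itertools import count

_next_id = count(100)  # hypothetical ID generator for this toy model


@dataclass(frozen=True)
class Attr:
    """Toy attribute: a name plus a unique expression ID."""
    name: str
    expr_id: int


def duplicate_resolved(left, right):
    """True when the two join sides expose no attribute ID in common."""
    return not ({a.expr_id for a in left} & {a.expr_id for a in right})


def dedup_right(left, right):
    """Re-alias conflicting right-side attributes.

    Returns the new right-side output and a map from old to new attributes.
    """
    left_ids = {a.expr_id for a in left}
    rewrites = {}
    new_right = []
    for attr in right:
        if attr.expr_id in left_ids:
            fresh = Attr(attr.name, next(_next_id))
            rewrites[attr] = fresh
            new_right.append(fresh)
        else:
            new_right.append(attr)
    return new_right, rewrites


# Assumed scenario: after the pull-up, both sides expose the same c1#3.
left_out = [Attr("cnt", 1), Attr("c1", 3)]
right_out = [Attr("cnt", 2), Attr("c1", 3)]

resolved_before = duplicate_resolved(left_out, right_out)
right_fixed, rewrites = dedup_right(left_out, right_out)
resolved_after = duplicate_resolved(left_out, right_fixed)
```

In this sketch the join is unresolved before deduplication and resolved after it. Any predicate pulled up from the right side would also need the `rewrites` map applied, since it would otherwise still reference the old attribute ID.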


