[ https://issues.apache.org/jira/browse/SPARK-19712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-19712: ------------------------------------ Assignee: Apache Spark > EXISTS and Left Semi join do not produce the same plan > ------------------------------------------------------ > > Key: SPARK-19712 > URL: https://issues.apache.org/jira/browse/SPARK-19712 > Project: Spark > Issue Type: Sub-task > Components: SQL > Affects Versions: 2.1.0 > Reporter: Nattavut Sutyanyong > Assignee: Apache Spark > > This problem was found during the development of SPARK-18874. > The EXISTS form in the following query: > {{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a where exists (select 1 > from t3 where t1.t1b=t3.t3b)")}} > gives the optimized plan below: > {code} > == Optimized Logical Plan == > Join Inner, (t1a#7 = t2a#25) > :- Join LeftSemi, (t1b#8 = t3b#58) > : :- Filter isnotnull(t1a#7) > : : +- Relation[t1a#7,t1b#8,t1c#9] parquet > : +- Project [1 AS 1#271, t3b#58] > : +- Relation[t3a#57,t3b#58,t3c#59] parquet > +- Filter isnotnull(t2a#25) > +- Relation[t2a#25,t2b#26,t2c#27] parquet > {code} > whereas a semantically equivalent Left Semi join query below: > {{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a left semi join t3 on > t1.t1b=t3.t3b")}} > gives the following optimized plan: > {code} > == Optimized Logical Plan == > Join LeftSemi, (t1b#8 = t3b#58) > :- Join Inner, (t1a#7 = t2a#25) > : :- Filter (isnotnull(t1b#8) && isnotnull(t1a#7)) > : : +- Relation[t1a#7,t1b#8,t1c#9] parquet > : +- Filter isnotnull(t2a#25) > : +- Relation[t2a#25,t2b#26,t2c#27] parquet > +- Project [t3b#58] > +- Relation[t3a#57,t3b#58,t3c#59] parquet > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org