KaiXinXIaoLei created SPARK-23540:
-------------------------------------

             Summary: The `where exists` action in optimized logical plan should be optimized
                 Key: SPARK-23540
                 URL: https://issues.apache.org/jira/browse/SPARK-23540
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: KaiXinXIaoLei


The optimized logical plan of the query `select * from tt1 where exists
(select * from tt2 where tt1.i = tt2.i);` is:

>== Optimized Logical Plan ==
>Join LeftSemi, (i#143 = i#145)
>:- MetastoreRelation default, tt1
>+- MetastoreRelation default, tt2

But the optimized logical plan of the query `select * from tt1 left semi join tt2 on tt2.i = tt1.i` is:

>== Optimized Logical Plan ==
>Join LeftSemi, (i#152 = i#150)
>:- Filter isnotnull(i#150)
>:  +- MetastoreRelation default, tt1
>+- Project [i#152]
>   +- MetastoreRelation default, tt2

 

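So I think the optimized logical plan of `select * from tt1 where exists
(select * from tt2 where tt1.i = tt2.i);` should be optimized further, in the
same way as the left semi join plan (the isnotnull filter on tt1 and the
projection of only column i from tt2). At present it is: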

 

>== Optimized Logical Plan ==
>Join LeftSemi, (i#143 = i#145)
>:- MetastoreRelation default, tt1
>+- MetastoreRelation default, tt2
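
As a rough illustration of why the extra filter and projection should be safe for the `exists` form as well, here is a minimal sketch using Python's built-in sqlite3 module rather than Spark. The table contents are made up for the example, and SQLite's planner is unrelated to Catalyst; the sketch only checks the SQL semantics: a NULL value of tt1.i can never satisfy `tt1.i = tt2.i`, so filtering out NULLs on tt1 and reading only column i from tt2 does not change the result.

```python
import sqlite3

# In-memory database with hypothetical sample data; both tables contain a NULL key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tt1 (i INTEGER);
    CREATE TABLE tt2 (i INTEGER);
    INSERT INTO tt1 VALUES (1), (2), (NULL), (4);
    INSERT INTO tt2 VALUES (2), (NULL), (4), (5);
""")

# The query from the report, with ORDER BY added for a deterministic result.
EXISTS_QUERY = """
    SELECT * FROM tt1
    WHERE EXISTS (SELECT * FROM tt2 WHERE tt1.i = tt2.i)
    ORDER BY i
"""

# Same query with the null filter on tt1 and the narrowed projection on tt2
# written out by hand, mirroring what the left semi join plan contains.
FILTERED_QUERY = """
    SELECT * FROM tt1
    WHERE tt1.i IS NOT NULL
      AND EXISTS (SELECT tt2.i FROM tt2 WHERE tt1.i = tt2.i)
    ORDER BY i
"""

plain = conn.execute(EXISTS_QUERY).fetchall()
filtered = conn.execute(FILTERED_QUERY).fetchall()

print(plain)     # rows of tt1 that have a match in tt2
print(filtered)  # same rows, with the filter and projection applied explicitly
```

Both queries return the same rows here, which is consistent with the expectation that the two Spark plans could share the same optimizations.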



