[ https://issues.apache.org/jira/browse/SPARK-23542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
KaiXinXIaoLei updated SPARK-23542: ---------------------------------- Summary: The exists action shoule be further optimized in logical plan (was: The `where exists' action in optimized logical plan should be optimized ) > The exists action shoule be further optimized in logical plan > ------------------------------------------------------------- > > Key: SPARK-23542 > URL: https://issues.apache.org/jira/browse/SPARK-23542 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 2.3.0 > Reporter: KaiXinXIaoLei > Priority: Major > > The optimized logical plan of query '*select * from tt1 where exists (select > * from tt2 where tt1.i = tt2.i)*' is : > {code:java} > == Optimized Logical Plan == > Join LeftSemi, (i#14 = i#16) > :- HiveTableRelation `default`.`tt1`, > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [i#14, s#15] > +- Project [i#16] > +- HiveTableRelation `default`.`tt2`, > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [i#16, s#17]{code} > > But the query of `*select * from tt1 left semi join tt2 on tt2.i = tt1.i*` is > : > {noformat} > == Optimized Logical Plan == > Join LeftSemi, (i#22 = i#20) > :- Filter isnotnull(i#20) > : +- HiveTableRelation `default`.`tt1`, > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [i#20, s#21] > +- Project [i#22] > +- HiveTableRelation `default`.`tt2`, > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [i#22, s#23]{noformat} > > So i think the optimized logical plan of '*select * from tt1 where exists > (select * from tt2 where tt1.i = tt2.i)*;` should be further optimization. > {code:java} > == Optimized Logical Plan == > Join LeftSemi, (i#14 = i#16) > :- Filter isnotnull(i#20) > : +- HiveTableRelation `default`.`tt1`, > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [i#14, s#15] > +- Project [i#16] > +- HiveTableRelation `default`.`tt2`, > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [i#16, s#17]{code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org