nuno-faria commented on issue #16973: URL: https://github.com/apache/datafusion/issues/16973#issuecomment-3187272706
Are there also plans to support `RightSemi`? If the result of the `IN` or `EXISTS` subquery is smaller, I think the planner will choose it as the build side, meaning it will use a `RightSemi` join.

For example, using `IN`:

```sql
explain analyze select *
from orders
where o_custkey in (
    select c_custkey
    from customer
    where c_phone = '25-989-741-2988'
);

HashJoinExec: mode=Partitioned, join_type=RightSemi, on=[(c_custkey@0, o_custkey@1)], filter=[o_custkey@1 >= 1 AND o_custkey@1 <= 1]
~1.2s
```

Similarly with `EXISTS`:

```sql
explain analyze select *
from orders
where exists (
    select 1
    from customer
    where c_phone = '25-989-741-2988' and c_custkey = o_custkey
);

HashJoinExec: mode=Partitioned, join_type=RightSemi, on=[(c_custkey@0, o_custkey@1)], filter=[o_custkey@1 >= 1 AND o_custkey@1 <= 1]
~1.2s
```

Meanwhile, an equivalent inner join now takes only around 90ms, using dynamic filter pushdown.
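
The exact inner join behind that comparison is not shown above, so the following is only a sketch of what "equivalent" could mean here. It uses SQLite through Python's `sqlite3` module instead of DataFusion, with made-up toy rows, and it assumes `c_custkey` is unique in `customer` (otherwise the join could duplicate `orders` rows). Under those assumptions, the `IN`, `EXISTS`, and inner-join forms return the same rows:

```python
import sqlite3

# Toy stand-ins for the TPC-H tables; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (c_custkey INTEGER PRIMARY KEY, c_phone TEXT);
    CREATE TABLE orders (o_orderkey INTEGER PRIMARY KEY, o_custkey INTEGER);
    INSERT INTO customer VALUES (1, '25-989-741-2988'), (2, '11-111-111-1111');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
""")

IN_QUERY = """
    SELECT * FROM orders
    WHERE o_custkey IN (
        SELECT c_custkey FROM customer WHERE c_phone = '25-989-741-2988'
    )
    ORDER BY o_orderkey
"""

EXISTS_QUERY = """
    SELECT * FROM orders
    WHERE EXISTS (
        SELECT 1 FROM customer
        WHERE c_phone = '25-989-741-2988' AND c_custkey = o_custkey
    )
    ORDER BY o_orderkey
"""

# Assumed inner-join rewrite; valid as a semi-join replacement only because
# c_custkey is unique, so each order matches at most one customer row.
JOIN_QUERY = """
    SELECT orders.* FROM orders
    JOIN customer ON c_custkey = o_custkey
    WHERE c_phone = '25-989-741-2988'
    ORDER BY o_orderkey
"""

in_rows = conn.execute(IN_QUERY).fetchall()
exists_rows = conn.execute(EXISTS_QUERY).fetchall()
join_rows = conn.execute(JOIN_QUERY).fetchall()

print(in_rows == exists_rows == join_rows)
```

This only checks result equivalence on toy data; it says nothing about the DataFusion plans or the timings quoted above.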