Re: Re: A question about broadcast nest loop join

2019-10-23 Thread Wenchen Fan
Ah sorry, I made a mistake. "Spark can only pick BroadcastNestedLoopJoin to implement left/right join" should be "left/right non-equal join".

On Thu, Oct 24, 2019 at 6:32 AM zhangliyun wrote:
>
> Hi Herman:
> I guess what you mentioned before
> ```
> if you are OK with slightly different

Re: A question about broadcast nest loop join

2019-10-23 Thread angers . zhu
A "where ... not in (query block)" condition is rewritten to a LeftAnti join by the optimizer rule RewritePredicateSubquery. Then, as cloud-fan said, it is planned as a BroadcastNestedLoopJoin.
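
To illustrate why such a query ends up as a nested loop: as far as I understand, the join condition produced for NOT IN is null-aware (roughly "a = b OR (a = b) IS NULL"), so there is no plain equi-join key to hash or sort on. Below is a minimal pure-Python sketch of that evaluation, not Spark's actual code. The function name `not_in_filter` and the use of plain lists for the two sides are assumptions made for illustration only.

```python
from typing import List, Optional


def not_in_filter(left: List[Optional[int]],
                  right: List[Optional[int]]) -> List[Optional[int]]:
    """Keep the rows of `left` for which `x NOT IN (right)` is true
    under SQL three-valued logic (None stands in for NULL)."""
    kept = []
    for x in left:          # streamed side, one row at a time
        matched = False
        for y in right:     # fully materialised side (the "broadcast" table)
            # Null-aware condition: x = y OR (x = y) IS NULL.
            # The comparison is NULL whenever either operand is NULL.
            if x is None or y is None or x == y:
                matched = True
                break
        if not matched:     # anti join: emit only rows without any match
            kept.append(x)
    return kept


print(not_in_filter([1, 2, 3, None], [2]))   # rows 1 and 3 survive
print(not_in_filter([1, 2], [2, None]))      # a NULL on the right removes every row
```

Note that the inner loop needs the whole right side in memory for every left row, which is why a very large subquery result can lead to an OOM when it is broadcast.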

Re: A question about broadcast nest loop join

2019-10-23 Thread Wenchen Fan
I haven't looked into your query yet; I just want to let you know that Spark can only pick BroadcastNestedLoopJoin to implement left/right join. If the table is very big, an OOM happens. Maybe there is an algorithm to implement left/right join in a distributed environment without broadcast, but