Ah sorry, I made a mistake. "Spark can only pick BroadcastNestedLoopJoin to
implement left/right join" should be "left/right non-equal join".
On Thu, Oct 24, 2019 at 6:32 AM zhangliyun wrote:
Hi Herman:
I guess what you mentioned before
```
if you are OK with slightly different NULL semantics then you could use NOT
EXISTS(subquery). The latter should perform a lot better.
```
is that a row with NULL key1 in the left table will be retained if NULL key2 is
not found in the right table ( join
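
For anyone following along, the NULL difference between NOT IN and NOT EXISTS can be tried out with a small sketch like the one below. It uses SQLite from the Python standard library rather than Spark, on the assumption that both follow the usual SQL three-valued logic for these two predicates. The table names t1 and t2 and the helper function name are made up for illustration; key1 and key2 are the column names from the message above.

```python
import sqlite3


def compare_not_in_vs_not_exists(left_keys, right_keys):
    """Run the same filter as NOT IN and as NOT EXISTS and return both results.

    t1/t2 are hypothetical tables; None in the input lists is stored as NULL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (key1 INTEGER)")
    conn.execute("CREATE TABLE t2 (key2 INTEGER)")
    conn.executemany("INSERT INTO t1 VALUES (?)", [(k,) for k in left_keys])
    conn.executemany("INSERT INTO t2 VALUES (?)", [(k,) for k in right_keys])

    # NOT IN: a NULL on either side makes the predicate unknown, so the row is dropped.
    not_in = conn.execute(
        "SELECT key1 FROM t1 WHERE key1 NOT IN (SELECT key2 FROM t2)"
    ).fetchall()

    # NOT EXISTS: a NULL key1 never equals anything, so the subquery is empty
    # and the row is kept.
    not_exists = conn.execute(
        "SELECT key1 FROM t1 WHERE NOT EXISTS "
        "(SELECT 1 FROM t2 WHERE t2.key2 = t1.key1)"
    ).fetchall()

    conn.close()
    return [row[0] for row in not_in], [row[0] for row in not_exists]


if __name__ == "__main__":
    print(compare_not_in_vs_not_exists([1, 2, None], [2]))
    print(compare_not_in_vs_not_exists([1, 2, None], [2, None]))
```

With left keys 1, 2, NULL and right key 2, NOT IN keeps only 1, while NOT EXISTS keeps 1 and the NULL row. Once the right side also contains a NULL, NOT IN returns no rows at all, and NOT EXISTS is unchanged.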
Hi all:
From Google, I know that:
Spark can only pick BroadcastNestedLoopJoin to implement left/right join.
But why, in the following case, does BroadcastNestedLoopJoin become
SortMergeJoin when I set spark.sql.autoBroadcastJoinThreshold=-1?
{code}
set
A where ... not in (query block) condition will be changed to a LeftAnti join
by the optimizer rule RewritePredicateSubquery. Then, as cloud-fan said, it
will be planned as a BroadcastNestedLoopJoin.
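
To make the point about the rewrite a bit more concrete: as far as I understand, the rewritten NOT IN has to use a null-aware condition of the form (a = b) OR (a = b) IS NULL, and because of the OR there is no plain equality key left to hash or sort on, which is why a nested loop is needed. The sketch below is plain Python, not Spark code; the function names are hypothetical and None stands for NULL.

```python
def sql_eq(a, b):
    """SQL equality under three-valued logic; None means NULL (unknown)."""
    if a is None or b is None:
        return None
    return a == b


def null_aware_anti_join(left_keys, right_keys):
    """Keep a left key only if no right key satisfies (l = r) OR (l = r) IS NULL.

    Every left row is compared against every right row, which is the
    nested-loop shape; right_keys plays the role of the broadcast side.
    """
    kept = []
    for left in left_keys:
        matched = False
        for right in right_keys:
            eq = sql_eq(left, right)
            if eq is True or eq is None:
                matched = True
                break
        if not matched:
            kept.append(left)
    return kept


if __name__ == "__main__":
    print(null_aware_anti_join([1, 2, None], [2]))
    print(null_aware_anti_join([1, 2, None], [2, None]))
```

The result agrees with NOT IN semantics: a NULL on either side counts as a possible match, so the left row is dropped, and with an empty right side every left row is kept.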
I haven't looked into your query yet, just want to let you know that: Spark
can only pick BroadcastNestedLoopJoin to implement left/right join. If the
table is very big, then OOM happens.
Maybe there is an algorithm to implement left/right join in a distributed
environment without broadcast, but
Hi all:
I want to ask a question about broadcast nested loop join. From Google, I know
that
left outer/semi join and right outer/semi join will use broadcast nested loop
join, and that in some cases, when the input data is very small, it is suitable
to use. So here,
how to define the input data very