[
https://issues.apache.org/jira/browse/SPARK-28220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17284653#comment-17284653
]
Yuming Wang commented on SPARK-28220:
-------------------------------------
It seems this is not a issue:
{code:sql}
postgres=# CREATE TABLE table1(a varchar(255), b varchar(255), c varchar(255));
CREATE TABLE
postgres=# CREATE TABLE table2(d varchar(255));
CREATE TABLE
postgres=# explain select * from table1 left join table2 on a = d and false;
QUERY PLAN
-----------------------------------------------------------------
Nested Loop Left Join (cost=0.00..11.00 rows=50 width=2064)
Join Filter: false
-> Seq Scan on table1 (cost=0.00..10.50 rows=50 width=1548)
-> Result (cost=0.00..0.00 rows=0 width=516)
One-Time Filter: false
(5 rows)
{code}
> join foldable condition not pushed down when parent filter is totally pushed
> down
> ---------------------------------------------------------------------------------
>
> Key: SPARK-28220
> URL: https://issues.apache.org/jira/browse/SPARK-28220
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.2, 3.0.0
> Reporter: liupengcheng
> Priority: Major
>
> We encountered a issue that join conditions not pushed down when we are
> running spark app on spark2.3, after carefully looking into the code and
> debugging, we found that it's because there is a bug in the rule
> `PushPredicateThroughJoin`:
> It will try to push parent filter down though the join, however, when the
> parent filter is wholly pushed down through the join, the join will become
> the top node, and then the `transform` method will skip the join to apply the
> rule.
>
> Suppose we have two tables: table1 and table2:
> table1: (a: string, b: string, c: string)
> table2: (d: string)
> sql as:
>
> {code:java}
> select * from table1 left join (select d, 'w1' as r from table2) on a = d and
> r = 'w2' where b = 2{code}
>
> let's focus on the following optimizer rules:
> PushPredicateThroughJoin
> FodablePropagation
> BooleanSimplification
> PruneFilters
>
> In the above case, on the first iteration of these rules:
> PushPredicateThroughJoin ->
> {code:java}
> select * from table1 where b=2 left join (select d, 'w1' as r from table2) on
> a = d and r = 'w2'
> {code}
> FodablePropagation ->
> {code:java}
> select * from table1 where b=2 left join (select d, 'w1' as r from table2) on
> a = d and 'w1' = 'w2'{code}
> BooleanSimplification ->
> {code:java}
> select * from table1 where b=2 left join (select d, 'w1' as r from table2) on
> false{code}
> PruneFilters -> No effective
>
> After several iteration of these rules, the join condition will still never
> be pushed to the
> right hand of the left join. thus, in some case(e.g. Large right table), the
> `BroadcastNestedLoopJoin` may be slow or oom.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]