[ https://issues.apache.org/jira/browse/SPARK-28220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303058#comment-17303058 ]
Apache Spark commented on SPARK-28220:
--------------------------------------

User 'wangyum' has created a pull request for this issue:
https://github.com/apache/spark/pull/31857

> join foldable condition not pushed down when parent filter is totally pushed down
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-28220
>                 URL: https://issues.apache.org/jira/browse/SPARK-28220
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.2, 3.0.0
>            Reporter: liupengcheng
>            Priority: Major
>
> We encountered an issue where join conditions were not pushed down while
> running a Spark app on Spark 2.3. After carefully reading the code and
> debugging, we found that it is caused by a bug in the rule
> `PushPredicateThroughJoin`:
> The rule tries to push the parent filter down through the join. However, when
> the parent filter is wholly pushed down through the join, the join becomes
> the top node, and the `transform` method then skips the join instead of
> applying the rule to it.
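The skipping behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Spark's code: `Node`, `transform_down` and `push_rule` are hypothetical names, the conditions are plain strings, and `transform_down` merely imitates a top-down traversal that applies the rule to a node and then recurses into the children of the result, without offering the result itself to the rule again.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical plan node: kind is "Filter", "Join" or "Scan"."""
    kind: str
    cond: Optional[str] = None          # predicate text, or table name for Scan
    children: Tuple["Node", ...] = ()


def transform_down(node: Node, rule: Callable[[Node], Node]) -> Node:
    """Apply `rule` to `node`, then recurse into the children of the result.

    The rewritten node itself is never passed to `rule` a second time.
    """
    after = rule(node)
    new_children = tuple(transform_down(c, rule) for c in after.children)
    return Node(after.kind, after.cond, new_children)


def push_rule(node: Node) -> Node:
    """Toy stand-in for a predicate pushdown rule with two cases."""
    # Case 1: Filter over Join. Push the whole filter to the left side,
    # which makes the Join the new top node.
    if node.kind == "Filter" and node.children[0].kind == "Join":
        join = node.children[0]
        left, right = join.children
        return Node("Join", join.cond, (Node("Filter", node.cond, (left,)), right))
    # Case 2: Join whose condition has a right-only conjunct. Push it right.
    if node.kind == "Join" and node.cond and "r = 'w2'" in node.cond:
        left, right = node.children
        return Node("Join", "a = d", (left, Node("Filter", "r = 'w2'", (right,))))
    return node


plan = Node(
    "Filter", "b = 2",
    (Node("Join", "a = d AND r = 'w2'",
          (Node("Scan", "table1"), Node("Scan", "table2"))),),
)

# First pass: case 1 fires at the root, and the new Join is not revisited,
# so its own condition stays untouched.
once = transform_down(plan, push_rule)

# A second pass over the unchanged condition would apply case 2.
twice = transform_down(once, push_rule)
```

In this toy model a second pass does push the join condition. In the scenario reported here, other rules rewrite the join condition between passes, so the pushdown opportunity is gone by the time the join is visited again.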
>
> Suppose we have two tables, table1 and table2:
> table1: (a: string, b: string, c: string)
> table2: (d: string)
> and the following SQL:
>
> {code:java}
> select * from table1 left join (select d, 'w1' as r from table2) on a = d and r = 'w2' where b = 2
> {code}
>
> Let's focus on the following optimizer rules:
> PushPredicateThroughJoin
> FoldablePropagation
> BooleanSimplification
> PruneFilters
>
> In the above case, on the first iteration of these rules:
> PushPredicateThroughJoin ->
> {code:java}
> select * from (select * from table1 where b = 2) left join (select d, 'w1' as r from table2) on a = d and r = 'w2'
> {code}
> FoldablePropagation ->
> {code:java}
> select * from (select * from table1 where b = 2) left join (select d, 'w1' as r from table2) on a = d and 'w1' = 'w2'
> {code}
> BooleanSimplification ->
> {code:java}
> select * from (select * from table1 where b = 2) left join (select d, 'w1' as r from table2) on false
> {code}
> PruneFilters -> no effect
>
> Even after several iterations of these rules, the join condition is never
> pushed to the right-hand side of the left join. Thus, in some cases (e.g. a
> large right table), the `BroadcastNestedLoopJoin` may be slow or run out of
> memory.