AssHero commented on code in PR #2750: URL: https://github.com/apache/arrow-datafusion/pull/2750#discussion_r905655191
##########
datafusion/core/src/execution/context.rs:
##########
@@ -1229,6 +1230,7 @@ impl SessionState {
         if config.config_options.get_bool(OPT_FILTER_NULL_JOIN_KEYS) {
             rules.push(Arc::new(FilterNullJoinKeys::default()));
         }
+        rules.push(Arc::new(ReduceOuterJoin::new()));

Review Comment:
   After FilterPushdown, filters such as (c1 < 100) have already been pushed down to the tables. At that point we would have to check the plan from the bottom up, and we would need more information to reduce the outer joins.

   For the query:

   select * from (select * from tt1 left join tt2 on c1 = c3) a right join tt3 on a.c2 = tt3.c5 where a.c4 < 100;

   after FilterPushdown, the logical plan is:

   > Projection: #a.c1, #a.c2, #a.c3, #a.c4, #tt3.c5, #tt3.c6
   >   Inner Join: #a.c2 = #tt3.c5
   >     Projection: #a.c1, #a.c2, #a.c3, #a.c4, alias=a
   >       Projection: #tt1.c1, #tt1.c2, #tt2.c3, #tt2.c4, alias=a
   >         Inner Join: #tt1.c1 = #tt2.c3
   >           TableScan: tt1 projection=Some([c1, c2])
   >           Filter: #tt2.c4 < Int64(100)
   >             TableScan: tt2 projection=Some([c3, c4])
   >     TableScan: tt3 projection=Some([c5, c6])

   We would need to walk down to the TableScan, find the filter, and then go back up to the parent plans to reduce the outer joins. I think this is more complicated to implement.

   Please let me know if I am missing something.
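
To make the ordering argument concrete, here is a minimal, self-contained sketch. It is not DataFusion's API: the `Plan` and `JoinType` types and the `reduce`, `tables`, and `join_types` functions are hypothetical toy stand-ins, and the sketch assumes that a comparison such as `tt2.c4 < 100` rejects rows where that column is NULL. A single top-down pass carries the set of "NULL-rejected" tables from a filter down to the joins below it. That works when the filter still sits above the joins, but finds nothing to use once the filter has been pushed below them.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum JoinType { Inner, Left, Right }

// Toy logical plan (hypothetical, not DataFusion's LogicalPlan).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: &'static str },
    // A filter assumed to reject rows where `null_rejecting_table`'s column is NULL.
    Filter { null_rejecting_table: &'static str, input: Box<Plan> },
    Join { join_type: JoinType, left: Box<Plan>, right: Box<Plan> },
}

// All base tables that appear under a plan node.
fn tables(plan: &Plan) -> Vec<&'static str> {
    match plan {
        Plan::Scan { table } => vec![*table],
        Plan::Filter { input, .. } => tables(input),
        Plan::Join { left, right, .. } => {
            let mut t = tables(left);
            t.extend(tables(right));
            t
        }
    }
}

// Top-down pass: `rejected` holds tables whose NULL-extended rows are
// discarded by some filter above the current node.
fn reduce(plan: Plan, rejected: &[&'static str]) -> Plan {
    match plan {
        Plan::Filter { null_rejecting_table, input } => {
            let mut r = rejected.to_vec();
            r.push(null_rejecting_table);
            Plan::Filter { null_rejecting_table, input: Box::new(reduce(*input, &r)) }
        }
        Plan::Join { join_type, left, right } => {
            let left_rejected = tables(&left).iter().any(|t| rejected.contains(t));
            let right_rejected = tables(&right).iter().any(|t| rejected.contains(t));
            // An outer join becomes inner when its nullable side is NULL-rejected.
            let new_type = match join_type {
                JoinType::Left if right_rejected => JoinType::Inner,
                JoinType::Right if left_rejected => JoinType::Inner,
                other => other,
            };
            Plan::Join {
                join_type: new_type,
                left: Box::new(reduce(*left, rejected)),
                right: Box::new(reduce(*right, rejected)),
            }
        }
        scan => scan,
    }
}

// Join types in pre-order, used only to inspect the result.
fn join_types(plan: &Plan) -> Vec<JoinType> {
    match plan {
        Plan::Scan { .. } => vec![],
        Plan::Filter { input, .. } => join_types(input),
        Plan::Join { join_type, left, right } => {
            let mut v = vec![*join_type];
            v.extend(join_types(left));
            v.extend(join_types(right));
            v
        }
    }
}

fn scan(table: &'static str) -> Box<Plan> { Box::new(Plan::Scan { table }) }

// Filter above the joins: (tt1 LEFT JOIN tt2) RIGHT JOIN tt3 WHERE tt2.c4 < 100.
fn filter_on_top() -> Plan {
    let inner = Plan::Join { join_type: JoinType::Left, left: scan("tt1"), right: scan("tt2") };
    let outer = Plan::Join { join_type: JoinType::Right, left: Box::new(inner), right: scan("tt3") };
    Plan::Filter { null_rejecting_table: "tt2", input: Box::new(outer) }
}

// Same query, but with the filter already sitting directly above the tt2 scan.
fn filter_pushed_down() -> Plan {
    let tt2 = Plan::Filter { null_rejecting_table: "tt2", input: scan("tt2") };
    let inner = Plan::Join { join_type: JoinType::Left, left: scan("tt1"), right: Box::new(tt2) };
    Plan::Join { join_type: JoinType::Right, left: Box::new(inner), right: scan("tt3") }
}

fn main() {
    // Filter on top: both outer joins are reduced in one top-down pass.
    let reduced = reduce(filter_on_top(), &[]);
    assert_eq!(join_types(&reduced), vec![JoinType::Inner, JoinType::Inner]);

    // Filter below the joins: the top-down pass sees no filter above either join.
    let unchanged = reduce(filter_pushed_down(), &[]);
    assert_eq!(join_types(&unchanged), vec![JoinType::Right, JoinType::Left]);
    println!("ok");
}
```

In the second case the rule would first have to collect filters from the leaves and propagate them upward, which is the extra bottom-up bookkeeping described in the comment.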