AssHero commented on code in PR #2750:
URL: https://github.com/apache/arrow-datafusion/pull/2750#discussion_r905655191


##########
datafusion/core/src/execution/context.rs:
##########
@@ -1229,6 +1230,7 @@ impl SessionState {
         if config.config_options.get_bool(OPT_FILTER_NULL_JOIN_KEYS) {
             rules.push(Arc::new(FilterNullJoinKeys::default()));
         }
+        rules.push(Arc::new(ReduceOuterJoin::new()));

Review Comment:
   After FilterPushdown, filters such as (c1 < 100) have already been pushed 
down to the table scans. At that point we would have to inspect the plan from 
the bottom up, and reducing the outer joins would need more information.
   
   For the query: select * from (select * from tt1 left join tt2 on c1 = c3) a 
right join tt3 on a.c2 = tt3.c5 where a.c4 < 100;
   after FilterPushdown, the logical plan is:
   
   > | Projection: #a.c1, #a.c2, #a.c3, #a.c4, #tt3.c5, #tt3.c6
   > |   Inner Join: #a.c2 = #tt3.c5
   > |     Projection: #a.c1, #a.c2, #a.c3, #a.c4, alias=a
   > |       Projection: #tt1.c1, #tt1.c2, #tt2.c3, #tt2.c4, alias=a
   > |         Inner Join: #tt1.c1 = #tt2.c3
   > |           TableScan: tt1 projection=Some([c1, c2])
   > |           Filter: #tt2.c4 < Int64(100)
   > |             TableScan: tt2 projection=Some([c3, c4])
   > |     TableScan: tt3 projection=Some([c5, c6])
   
   To reduce the outer joins at that point, we would have to walk down to the 
TableScan, find the filter, and then go back up to the parent plans. I think 
that is more complicated to implement.
   
   Please let me know if I missed something.
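
   To illustrate why running the rule before FilterPushdown is simpler, here is 
a minimal sketch of a top-down pass over a toy plan type. The `Plan` enum, 
`reduce_outer_joins`, and the other helpers are hypothetical and are not 
DataFusion's real `LogicalPlan` or `ReduceOuterJoin` API. The sketch assumes 
every `Filter` is null-rejecting on one column of the named table, as 
`a.c4 < 100` is for `tt2`.

```rust
// Toy model only: these types are NOT DataFusion's LogicalPlan API.
#[derive(Debug, Clone, PartialEq)]
enum JoinType {
    Inner,
    Left,
    Right,
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(&'static str),
    // Assumed to be a null-rejecting predicate on a column of `table`.
    Filter { table: &'static str, input: Box<Plan> },
    Join { join_type: JoinType, left: Box<Plan>, right: Box<Plan> },
}

/// True if `table` is scanned somewhere below `plan`.
fn contains_table(plan: &Plan, table: &str) -> bool {
    match plan {
        Plan::Scan(name) => *name == table,
        Plan::Filter { input, .. } => contains_table(input, table),
        Plan::Join { left, right, .. } => {
            contains_table(left, table) || contains_table(right, table)
        }
    }
}

/// True if any null-rejected table is scanned below `side`.
fn filtered(side: &Plan, non_null: &[&'static str]) -> bool {
    non_null.iter().any(|t| contains_table(side, t))
}

/// Top-down pass: `non_null` holds the tables constrained by filters seen
/// above the current node. An outer join whose nullable side contains such
/// a table is rewritten to an inner join.
fn reduce_outer_joins(plan: Plan, non_null: &mut Vec<&'static str>) -> Plan {
    match plan {
        Plan::Scan(name) => Plan::Scan(name),
        Plan::Filter { table, input } => {
            non_null.push(table);
            let input = reduce_outer_joins(*input, non_null);
            non_null.pop();
            Plan::Filter { table, input: Box::new(input) }
        }
        Plan::Join { join_type, left, right } => {
            let join_type = match join_type {
                // Left join: the right input is the nullable side.
                JoinType::Left if filtered(&right, non_null.as_slice()) => JoinType::Inner,
                // Right join: the left input is the nullable side.
                JoinType::Right if filtered(&left, non_null.as_slice()) => JoinType::Inner,
                other => other,
            };
            let left = reduce_outer_joins(*left, non_null);
            let right = reduce_outer_joins(*right, non_null);
            Plan::Join { join_type, left: Box::new(left), right: Box::new(right) }
        }
    }
}

/// Join types in pre-order, for inspecting the result.
fn join_types(plan: &Plan) -> Vec<JoinType> {
    match plan {
        Plan::Scan(_) => Vec::new(),
        Plan::Filter { input, .. } => join_types(input),
        Plan::Join { join_type, left, right } => {
            let mut out = vec![join_type.clone()];
            out.extend(join_types(left));
            out.extend(join_types(right));
            out
        }
    }
}

/// The query above before pushdown: a filter on `filter_table` sits on top of
/// (tt1 LEFT JOIN tt2) RIGHT JOIN tt3.
fn example_plan(filter_table: &'static str) -> Plan {
    let inner = Plan::Join {
        join_type: JoinType::Left,
        left: Box::new(Plan::Scan("tt1")),
        right: Box::new(Plan::Scan("tt2")),
    };
    let outer = Plan::Join {
        join_type: JoinType::Right,
        left: Box::new(inner),
        right: Box::new(Plan::Scan("tt3")),
    };
    Plan::Filter { table: filter_table, input: Box::new(outer) }
}
```

   With the filter still above the joins, one downward walk is enough: 
`reduce_outer_joins(example_plan("tt2"), &mut Vec::new())` turns both joins 
into inner joins in this toy model. Once the filter sits directly above the 
`tt2` scan, the same information has to be collected from the leaves and 
carried back up to each parent join.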


