epsio-banay commented on code in PR #12943:
URL: https://github.com/apache/datafusion/pull/12943#discussion_r1801471152


##########
datafusion/optimizer/src/push_down_filter.rs:
##########
@@ -685,7 +685,8 @@ impl OptimizerRule for PushDownFilter {
                         .map(LogicalPlan::Filter)?;
                 insert_below(LogicalPlan::Repartition(repartition), new_filter)
             }
-            LogicalPlan::Distinct(distinct) => {
+            LogicalPlan::Distinct(distinct @ Distinct::All(_)) => {
+                // note that we check for distinct all as distinct on is not commutable

Review Comment:
   You are right.
   I think the push-down should be allowed only if the predicate references 
nothing but expressions that appear in the `on_expr` vec, or constants. Do you 
know of any tool or function that can help with that check?
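
To make the proposed rule concrete, here is a minimal sketch that does not use DataFusion's own types. The `Expr` enum, `collect_columns`, and `can_push_below_distinct_on` are all hypothetical names invented for illustration, and the sketch assumes the `DISTINCT ON` expressions are bare column names. It walks the predicate, gathers every column it references, and allows the push-down only if all of them are `DISTINCT ON` columns; literals are always allowed.

```rust
use std::collections::HashSet;

/// Toy expression tree (hypothetical), standing in for a planner's expression type.
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Literal(i64),
    /// left operand, operator symbol, right operand
    Binary(Box<Expr>, &'static str, Box<Expr>),
}

/// Collect every column name referenced anywhere inside `expr`.
fn collect_columns<'a>(expr: &'a Expr, out: &mut HashSet<&'a str>) {
    match expr {
        Expr::Column(name) => {
            out.insert(name.as_str());
        }
        // Constants reference no columns, so they never block the push-down.
        Expr::Literal(_) => {}
        Expr::Binary(left, _, right) => {
            collect_columns(left, out);
            collect_columns(right, out);
        }
    }
}

/// Returns true when the predicate touches only DISTINCT ON columns or constants.
/// Assumption: `on_columns` lists plain column names, not arbitrary expressions.
fn can_push_below_distinct_on(predicate: &Expr, on_columns: &[&str]) -> bool {
    let mut used = HashSet::new();
    collect_columns(predicate, &mut used);
    used.iter().all(|col| on_columns.contains(col))
}
```

With `on_columns = ["a"]`, a predicate such as `b <> 2` is rejected because it references `b`, while `a > 1` is accepted. A real implementation would also need to match non-column `ON` expressions structurally, which this sketch does not attempt.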
   
   Another thing I missed: the push-down should also be allowed when the filter 
belongs to the same logical SELECT as the DISTINCT, i.e. in this query the 
push-down should be allowed:
   
   `select distinct on (a) a, b from foo where b <> 2 order by a, b desc;`.
   
   I think the optimization bug occurs when the filter is pushed down into a 
subquery that contains the DISTINCT, for example:
   
   `select * from (select distinct on (a) a, b from foo order by a, b desc) sub where b <> 2;`
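
To see why the two queries differ, here is a small in-memory sketch. The table contents and the helper names `distinct_on_a` and `filter_b_ne_2` are made up for illustration. It models `distinct on (a) ... order by a, b desc` as "keep the row with the largest `b` for each `a`" and lets the filter be applied either before or after that step.

```rust
use std::collections::BTreeMap;

/// Models `DISTINCT ON (a) ... ORDER BY a, b DESC`:
/// for each value of `a`, keep the row with the largest `b`.
fn distinct_on_a(rows: &[(i32, i32)]) -> Vec<(i32, i32)> {
    let mut best: BTreeMap<i32, i32> = BTreeMap::new();
    for &(a, b) in rows {
        best.entry(a)
            .and_modify(|cur| *cur = (*cur).max(b))
            .or_insert(b);
    }
    // BTreeMap iterates in ascending key order, matching `ORDER BY a`.
    best.into_iter().collect()
}

/// Models the predicate `b <> 2`.
fn filter_b_ne_2(rows: &[(i32, i32)]) -> Vec<(i32, i32)> {
    rows.iter().copied().filter(|&(_, b)| b != 2).collect()
}
```

Take `foo = [(1, 1), (1, 2), (2, 3)]`. Filtering first and then applying the distinct (the single-SELECT query) gives `[(1, 1), (2, 3)]`. Applying the distinct first and then filtering (the subquery form) gives `[(2, 3)]`, because the row `(1, 2)` wins the distinct for `a = 1` and is then removed by the filter. Pushing the filter below the distinct in the subquery form would therefore change the result.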
   
   I need to add tests for that as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

