asolimando commented on code in PR #18404:
URL: https://github.com/apache/datafusion/pull/18404#discussion_r2482165518


##########
datafusion/physical-plan/src/aggregates/mod.rs:
##########
@@ -1025,6 +1031,76 @@ impl ExecutionPlan for AggregateExec {
     fn cardinality_effect(&self) -> CardinalityEffect {
         CardinalityEffect::LowerEqual
     }
+

Review Comment:
   ```
   SELECT a, b, COUNT(*)
   FROM t
   WHERE $FILTER(a) AND $FILTER(b)
   GROUP BY GROUPING SETS ((a, b), (b))
   ```
   
   I think we are still being overly conservative here, we could either 
transform the filter expressions into CNF and split conjuncts, and analyze them 
separately.
   
   CNF expansion is worst-case exponential in the size of the input expression, 
therefore it can be very costly, we could at least split conjunctive filters 
as-is, so that individual safe conjuncts (`$FILTER(b)` in the example above) 
can be pushed down even if the whole conjunction can't.
   
   I am approving the PR (hoping this helps committers) as the current 
implementation is still conservative in some cases but totally sound, and it's 
already improving other the existing code, and the suggestion above can be 
tackled in a follow-up PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to