GitHub user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15289

@JoshRosen In your example, we don't want to first count one million rows coming from the base table and then return zero rows because of the false predicate in the outer query block. Instead, by pushing the predicate down to the base table, you pre-filter and return zero rows early in the plan. Then you apply the aggregate, followed by the original predicate, which does the final filtering. Anyway, this is just a thought for further optimization of predicates pushed down through aggregation.

Also, a more realistic query would imply a false predicate through predicate transitivity, e.g. a != b and a = 1 and b = 1 => 1 != 1. So there might be some real customer queries that can take advantage of these optimizations.
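
To make the transitivity case concrete, here is a minimal Python sketch of how a conjunction such as a != b and a = 1 and b = 1 can be shown to be always false by constant propagation. This is not Catalyst code: the tuple representation of a conjunct and the function name `is_trivially_false` are assumptions made up for illustration, and the sketch only handles integer constants written on the right-hand side of `=`.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a conjunct is (left, op, right), where an
# operand is either a column name (str) or an integer constant.
Conjunct = Tuple[object, str, object]


def is_trivially_false(conjuncts: List[Conjunct]) -> bool:
    """Return True if constant propagation proves the conjunction never holds."""
    # Step 1: collect "column = constant" bindings.
    bindings: Dict[str, int] = {}
    for left, op, right in conjuncts:
        if op == "=" and isinstance(left, str) and isinstance(right, int):
            if left in bindings and bindings[left] != right:
                return True  # e.g. a = 1 and a = 2
            bindings[left] = right

    # Step 2: substitute the bindings, then evaluate every conjunct whose
    # operands have both become constants.
    def resolve(operand: object) -> object:
        if isinstance(operand, str):
            return bindings.get(operand, operand)
        return operand

    for left, op, right in conjuncts:
        l, r = resolve(left), resolve(right)
        if isinstance(l, int) and isinstance(r, int):
            if op == "=" and l != r:
                return True
            if op == "!=" and l == r:
                return True  # e.g. a != b becomes 1 != 1
    return False


example = [("a", "!=", "b"), ("a", "=", 1), ("b", "=", 1)]
print(is_trivially_false(example))
```

An optimizer that detects this could replace the filter with a false literal and let it be pushed toward the base table, so the scan returns no rows before the aggregate runs.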