agubichev commented on code in PR #45125: URL: https://github.com/apache/spark/pull/45125#discussion_r1514762032
########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteWithExpression.scala: ########## @@ -34,7 +34,7 @@ import org.apache.spark.sql.catalyst.trees.TreePattern.{COMMON_EXPR_REF, WITH_EX */ object RewriteWithExpression extends Rule[LogicalPlan] { override def apply(plan: LogicalPlan): LogicalPlan = { - plan.transformWithPruning(_.containsPattern(WITH_EXPRESSION)) { + plan.transformDownWithSubqueriesAndPruning(_.containsPattern(WITH_EXPRESSION)) { Review Comment: This is needed when an Aggregate (count-bug-susceptible) has a WITH expression in it, because we skip optimizing it in OptimizeSubqueries. This only matters because RewriteWithExpression appears very early on in the optimization, and it is not triggered after we unnest the subquery. Any "later" rule would apply on the rewritten subquery, so it would not be a problem. For this "early" rules there are two options: i) repeat them after the subquery is fully unnested (i.e., the count bug has been handled, and the subquery is replaced with a join), or ii) make sure they run on subqueries. I chose option ii). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org