viirya commented on a change in pull request #32699:
URL: https://github.com/apache/spark/pull/32699#discussion_r642617321
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
##########
@@ -1039,21 +1039,25 @@ class CodegenContext extends Logging {
   def subexpressionEliminationForWholeStageCodegen(expressions: Seq[Expression]): SubExprCodes = {
     // Create a clear EquivalentExpressions and SubExprEliminationState mapping
     val equivalentExpressions: EquivalentExpressions = new EquivalentExpressions
-    val localSubExprEliminationExprs = mutable.HashMap.empty[Expression, SubExprEliminationState]
+    val localSubExprEliminationExprsForNonSplit =
+      mutable.HashMap.empty[Expression, SubExprEliminationState]

     // Add each expression tree and compute the common subexpressions.
     expressions.foreach(equivalentExpressions.addExprTree(_))

     // Get all the expressions that appear at least twice and set up the state for subexpression
     // elimination.
     val commonExprs = equivalentExpressions.getAllEquivalentExprs(1)
-    lazy val commonExprVals = commonExprs.map(_.head.genCode(this))

-    lazy val nonSplitExprCode = {
-      commonExprs.zip(commonExprVals).map { case (exprs, eval) =>
-        // Generate the code for this expression tree.
-        val state = SubExprEliminationState(eval.isNull, eval.value)
-        exprs.foreach(localSubExprEliminationExprs.put(_, state))
+    val nonSplitExprCode = {
+      commonExprs.map { exprs =>
+        val eval = withSubExprEliminationExprs(localSubExprEliminationExprsForNonSplit.toMap) {

Review comment:
For each set of common expressions, `withSubExprEliminationExprs` is called only once, so I don't think it is actually a recursive call.

`withSubExprEliminationExprs` takes the given map of subexpression elimination states and uses it to replace common expressions during expression codegen in the closure. It returns the evaluated expression code (value/isNull/code).

Take these two subexpressions as an example:

1. `simpleUDF($"id")`
2. `functions.length(simpleUDF($"id"))`

1st round of `withSubExprEliminationExprs`:

The map is empty.
Gen code for `simpleUDF($"id")`.
Put it into the map => (`simpleUDF($"id")` -> gen-ed code)

2nd round of `withSubExprEliminationExprs`:

Gen code for `functions.length(simpleUDF($"id"))`.
Look at the map and replace the common expression `simpleUDF($"id")` with its gen-ed code.
Put it into the map => (`simpleUDF($"id")` -> gen-ed code, `functions.length(simpleUDF($"id"))` -> gen-ed code)

The map will be used later for subexpression elimination.
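
To make the two rounds above concrete, here is a minimal standalone sketch of the same idea. It is a toy model, not Spark code: `Expr`, `Col`, `Call`, `gen`, `eliminate` and the `subExpr_N` variable names are all hypothetical stand-ins, and the generated "code" is just a string. It also assumes the common expressions are listed with the simpler ones first.

```scala
import scala.collection.mutable

// Toy expression tree. These are illustrative stand-ins, not Spark's Expression classes.
sealed trait Expr
case class Col(name: String) extends Expr
case class Call(fn: String, arg: Expr) extends Expr

object SubExprSketch {
  // Generate "code" (a string) for an expression. Any subexpression already
  // present in `known` is replaced by the variable that holds its result.
  def gen(e: Expr, known: Map[Expr, String]): String = known.get(e) match {
    case Some(variable) => variable
    case None => e match {
      case Col(n)     => n
      case Call(f, a) => s"$f(${gen(a, known)})"
    }
  }

  // One round per common expression: gen code using an immutable snapshot of
  // the map built so far, then record the new expression in the map.
  def eliminate(commonExprs: Seq[Expr]): (Seq[String], Map[Expr, String]) = {
    val states = mutable.LinkedHashMap.empty[Expr, String]
    val decls = commonExprs.zipWithIndex.map { case (expr, i) =>
      val code = gen(expr, states.toMap) // called once per expression, no recursion into eliminate
      val variable = s"subExpr_$i"       // hypothetical variable naming
      states.put(expr, variable)
      s"$variable = $code"
    }
    (decls, states.toMap)
  }

  def main(args: Array[String]): Unit = {
    val udf = Call("simpleUDF", Col("id"))
    val len = Call("length", udf)
    val (decls, _) = eliminate(Seq(udf, len))
    decls.foreach(println)
    // prints:
    // subExpr_0 = simpleUDF(id)
    // subExpr_1 = length(subExpr_0)
  }
}
```

In the second round the snapshot already contains `simpleUDF(id)`, so the code for `length(...)` refers to `subExpr_0` instead of evaluating the UDF again.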