Github user rednaxelafx commented on the issue: https://github.com/apache/spark/pull/20419 @kiszk SGTM and LGTM. Let's ship it! One more question on the side: with the `forceComment = true`, are we fully sure that won't affect the equality of `CodeAndComment`? The whole point of this PR is to have a way to embed the ID into the non-executable code of the generated code so that it won't affect the codegen cache hit, right? Could you please post an example of either the generated code or the metrics, e.g. a `SortMergeJoin` on two identical `spark.range(3)`s, and confirm that even when the two `range()`s are codegen's into different `codegenStageId`s, with the `spark.sql.codegen.useIdInClassName` turned off, the two stages would still hit the codegen cache? Basically to verify the example I gave in https://github.com/apache/spark/pull/20224#issuecomment-357091842 still hits the codegen cache when `spark.sql.codegen.useIdInClassName=false`.
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org