Github user rednaxelafx commented on a diff in the pull request: https://github.com/apache/spark/pull/20224#discussion_r163453639 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala --- @@ -228,4 +229,21 @@ class WholeStageCodegenSuite extends QueryTest with SharedSQLContext { } } } + + test("including codegen stage ID in generated class name should not regress codegen caching") { + import testImplicits._ + + withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_USE_ID_IN_CLASS_NAME.key -> "true") { + val bytecodeSizeHisto = CodegenMetrics.METRIC_GENERATED_METHOD_BYTECODE_SIZE + spark.range(3).select('id + 2).collect + val after1 = bytecodeSizeHisto.getCount + spark.range(3).select('id + 2).collect + val after2 = bytecodeSizeHisto.getCount // same query shape as above, deliberately + assert(after1 == after2, "the same query run twice should hit the codegen cache") + + spark.range(5).select('id * 2).collect + val after3 = bytecodeSizeHisto.getCount + assert(after3 >= after2, "a different query can result in codegen cache miss, that's okay") --- End diff -- I actually deliberately wrote it this way. Note how I phrased in the assertion message as "can result in codegen cache miss" instead of "will result in". That's because the code shape of this third query was deliberately chosen to be similar to the two queries before it: all three have `spark.range(some_const).select(some_expr).collect`, so if any future changes to codegen of `Range` or `Project` operators affect how much specialized code (such as constant values) we directly embed into the code, it's actually possible to this third query to generate the same code as the first two, which will result in a codegen cache miss. So I'm making this check a bit loose. It's just there to indicate that it's acceptable to for a different query to encounter a codegen cache miss. WYDT?
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org