LuciferYang opened a new pull request, #55497: URL: https://github.com/apache/spark/pull/55497
### What changes were proposed in this pull request?

Add a pre-warm pass (3 iterations of fresh reader + `initFromPage` + decode) before the cold-reader `benchmark.addCase` call in `runBooleanBenchmark` and `runIntegerBenchmark` of `VectorizedRleValuesReaderBenchmark`.

### Why are the changes needed?

Reviewer feedback on SPARK-56522 (PR #55386) flagged first-case `Best Time(ms) = 0` variance in the `runBooleanBenchmark`/`runIntegerBenchmark` groups: the first case in each group pays for tiered-compilation transitions on sub-millisecond iterations, producing inconsistent baseline numbers between re-runs.

The `runNullableBatchBenchmark` groups don't show this because their setup reuses a pre-warmed reader before each `addCase`. The cold-reader variants in the `runBooleanBenchmark`/`runIntegerBenchmark` groups instantiate a fresh reader per iteration, so the shared pre-warm (`warmReader.readBooleans` / `warmReader.readIntegers`) doesn't fully cover the allocation + `initFromPage` path that the `cold reader` case exercises. Running the cold-reader code path explicitly 3 times before `addCase` lets HotSpot settle on C2 before measurement.

### Does this PR introduce _any_ user-facing change?

No. Benchmark-only change.

### How was this patch tested?

- Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7
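
For readers who want a picture of the pre-warm pass described above, here is a minimal, self-contained sketch in Java. It is not the code in this PR: `ToyReader`, `coldReaderIteration`, `prewarm`, and the bit-per-value page layout are hypothetical stand-ins, and only the shape (fresh reader, `initFromPage`, decode, repeated 3 times before the timed case is registered) is taken from the description. A toy this small does not demonstrate JIT behavior; it only shows where the pass sits.

```java
public class ColdReaderPrewarm {

    /** Hypothetical stand-in for the reader under test; NOT Spark's VectorizedRleValuesReader. */
    static final class ToyReader {
        private byte[] page;
        private int pos;

        /** Points the reader at a page and resets its position. */
        void initFromPage(byte[] pageBytes) {
            this.page = pageBytes;
            this.pos = 0;
        }

        /** Decodes one bit per value (least significant bit first); returns the count of true values. */
        int readBooleans(int count, boolean[] out) {
            int trueCount = 0;
            for (int i = 0; i < count; i++) {
                int bitIndex = pos + i;
                boolean value = ((page[bitIndex >>> 3] >>> (bitIndex & 7)) & 1) == 1;
                out[i] = value;
                if (value) {
                    trueCount++;
                }
            }
            pos += count;
            return trueCount;
        }
    }

    /** Iteration count taken from the PR description. */
    static final int PREWARM_ITERATIONS = 3;

    /** One cold-reader iteration: allocate a fresh reader, init it from the page, decode. */
    public static int coldReaderIteration(byte[] page, int count, boolean[] out) {
        ToyReader reader = new ToyReader();
        reader.initFromPage(page);
        return reader.readBooleans(count, out);
    }

    /** Runs the same path the timed case will run, accumulating a checksum so the work is observable. */
    public static long prewarm(byte[] page, int count) {
        boolean[] out = new boolean[count];
        long sink = 0;
        for (int i = 0; i < PREWARM_ITERATIONS; i++) {
            sink += coldReaderIteration(page, count, out);
        }
        return sink;
    }

    public static void main(String[] args) {
        byte[] page = {(byte) 0b10101010, (byte) 0b00001111};
        long sink = prewarm(page, 16);
        System.out.println("pre-warm checksum: " + sink);
        // In the real benchmark, the cold-reader case would be registered
        // (benchmark.addCase) only after this point.
    }
}
```

The key design point is that the pre-warm calls the same per-iteration routine as the measured case, so the allocation and `initFromPage` path is exercised rather than only the decode loop of an already-initialized reader.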
