LuciferYang opened a new pull request, #55497:
URL: https://github.com/apache/spark/pull/55497

   ### What changes were proposed in this pull request?
   
   Add a pre-warm pass (3 iterations, each creating a fresh reader, calling 
`initFromPage`, and decoding) before the cold-reader `benchmark.addCase` call 
in `runBooleanBenchmark` and `runIntegerBenchmark` of 
`VectorizedRleValuesReaderBenchmark`.
   
   ### Why are the changes needed?
   
   Reviewer feedback on SPARK-56522 (PR #55386) flagged variance in the first 
case of the `runBooleanBenchmark`/`runIntegerBenchmark` groups, including 
`Best Time(ms) = 0` readings: the first case in each group pays for 
tiered-compilation transitions on sub-millisecond iterations, producing 
inconsistent baseline numbers between re-runs.
   
   The `runNullableBatchBenchmark` groups don't show this because their setup 
reuses a pre-warmed reader before each `addCase`. The cold-reader variants in 
the `runBooleanBenchmark`/`runIntegerBenchmark` groups instantiate a fresh 
reader per iteration, so the shared pre-warm (`warmReader.readBooleans` / 
`warmReader.readIntegers`) doesn't fully cover the allocation + `initFromPage` 
path that the `cold reader` case exercises. Running the cold-reader code path 
explicitly 3 times before `addCase` lets HotSpot settle on C2 before 
measurement.
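   
   The pre-warm idea can be sketched in isolation as follows. This is a 
minimal illustration, not the code in this PR: `ToyReader`, `coldIteration`, 
and `preWarm` are hypothetical names, and the toy reader is only a stand-in 
for the real reader and for Spark's benchmark harness. It shows the shape of 
the change: run the exact cold path (fresh reader + `initFromPage` + decode) a 
few times untimed, then time it.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class PreWarmSketch {
    // Hypothetical stand-in for the reader under test (not Spark's class).
    static final class ToyReader {
        private ByteBuffer page;

        void initFromPage(byte[] data) {
            page = ByteBuffer.wrap(data);
        }

        // Sums `count` bytes; returning the sum keeps the loop observable.
        long decode(int count) {
            long sum = 0;
            for (int i = 0; i < count; i++) {
                sum += page.get(i);
            }
            return sum;
        }
    }

    // One cold-reader iteration: fresh reader + initFromPage + decode.
    static long coldIteration(byte[] data) {
        ToyReader reader = new ToyReader();
        reader.initFromPage(data);
        return reader.decode(data.length);
    }

    // Pre-warm pass: runs the same cold path `iterations` times, untimed.
    static long preWarm(byte[] data, int iterations) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += coldIteration(data);
        }
        return sink;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20];
        Arrays.fill(data, (byte) 1);

        long sink = preWarm(data, 3);          // not measured

        long start = System.nanoTime();
        sink += coldIteration(data);           // measured case
        long elapsedNs = System.nanoTime() - start;

        // Printing the sink keeps the decoded result live.
        System.out.println("sink=" + sink + " elapsedNs=" + elapsedNs);
    }
}
```

   The key design point is that the warm-up calls the same method as the timed 
case, so allocation and `initFromPage` are exercised too, rather than only the 
decode loop on an already-initialized reader.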
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. Benchmark-only change.
   
   ### How was this patch tested?
   - Pass GitHub Actions
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Opus 4.7
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

