On Thu, 12 Mar 2026 18:15:40 GMT, Andrew Haley <[email protected]> wrote:
>> Please use [this link](https://github.com/openjdk/jdk/pull/28541/changes?w=1) to view the files changed.
>>
>> Profile counters scale very badly.
>>
>> The overhead for profiled code isn't too bad with one thread, but as the thread count increases, things go wrong very quickly.
>>
>> For example, here's a benchmark from the OpenJDK test suite, run at TieredLevel 3 with one thread, then with three threads:
>>
>>     Benchmark                        (randomized)  Mode  Cnt    Score   Error  Units
>>     InterfaceCalls.test2ndInt5Types  false         avgt    4   27.468 ± 2.631  ns/op
>>     InterfaceCalls.test2ndInt5Types  false         avgt    4  240.010 ± 6.329  ns/op
>>
>> This slowdown is caused by high memory contention on the profile counters. Not only is this slow, but it can also lose profile counts.
>>
>> This patch is for C1 only. It'd be easy to randomize C1 counters as well in another PR, if anyone thinks it's worth doing.
>>
>> One other thing to note is that randomized profile counters degrade very badly with small decimation ratios. For example, using a ratio of 2 (`-XX:ProfileCaptureRatio=2`) with a single thread results in
>>
>>     Benchmark                        (randomized)  Mode  Cnt   Score   Error  Units
>>     InterfaceCalls.test2ndInt5Types  false         avgt    4  80.147 ± 9.991  ns/op
>>
>> The problem is that the branch prediction rate drops off sharply, leading to many mispredictions. It only really makes sense to use higher decimation ratios, e.g. 64.

> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
>
>   Cleanup

FWIW, `java -jar dacapo-23.11-MR2-chopin.jar jython -s large` works fine on recent mainline.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28541#issuecomment-4056078553
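For readers unfamiliar with the decimation idea being discussed: instead of every thread writing to a shared counter on every event (which causes the cache-line contention behind the slowdown above), a randomized counter increments only with probability 1/ratio and scales the stored value to compensate. The sketch below is illustrative only; the class and field names are hypothetical and do not reflect the actual HotSpot implementation in this PR.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch of a decimated (randomized) profile counter.
// With RATIO = 64, only ~1 in 64 events performs a memory write,
// greatly reducing contention on the counter's cache line.
class DecimatedCounter {
    // Hypothetical constant, standing in for -XX:ProfileCaptureRatio=64.
    // Must be a power of two for the mask trick in record().
    static final int RATIO = 64;

    private long count;

    void record() {
        // Increment with probability 1/RATIO, scaling by RATIO so that
        // the expected value of `count` still equals the true event count.
        if ((ThreadLocalRandom.current().nextInt() & (RATIO - 1)) == 0) {
            count += RATIO;
        }
    }

    long estimate() {
        return count;
    }
}
```

The estimate is unbiased but noisy: its standard deviation grows with the ratio, which is the statistical cost traded for the contention relief, and, as noted above, very small ratios make the sampling branch nearly unpredictable, so large ratios are the only regime where this pays off.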
