On Thu, 23 Apr 2026 16:24:53 GMT, Paul Hübner <[email protected]> wrote:
>> Hi all,
>>
>> The Java FF&M API includes functionality to both initialize and read from
>> thread-local data prior to and immediately after downcalls, respectively,
>> through the `Linker.Option::captureCallState` API. This is useful, as an
>> example, when setting or capturing `errno` when interfacing with C
>> functions. However, using this linker option introduces some invocation
>> overhead at runtime.
>>
>> This RFE introduces a JMH microbenchmark which quantifies this overhead. A
>> simple downcall to `strtol` is measured with and without call state
>> capturing.
>>
>> Testing: GHA for sanity testing.
>>
>> # Benchmarking Results
>>
>> ⚠️ **These results apply prior to
>> [28506ca](https://github.com/openjdk/jdk/pull/30719/commits/28506ca7a8f03cddb0903e4f58b4f38742f15606)!**
>>
>> I have executed this benchmark on Oracle-supported platforms, the results
>> can be found below. For each platform, I've done two trials corresponding to
>> different JDKs: one being the current HEAD ("current", based on
>> 8357de88aa35ee998fefe321ee6dae9eb4993fa6), and one with
>> [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559) reverted
>> ("legacy", revert commit on top of
>> 8357de88aa35ee998fefe321ee6dae9eb4993fa6). This one-off experiment is
>> insightful since [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559)
>> increased the overhead of downcalls using state capturing by introducing
>> thread-local data initialization. The performance impact of this change was
>> previously unquantified.
>>
>> ## Linux x64
>>
>> **Current:**
>>
>> Benchmark                                               Mode  Cnt   Score   Error  Units
>> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  38.442 ± 0.016  ns/op
>> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  45.425 ± 1.826  ns/op
>>
>>
>> **Legacy:**
>>
>> Benchmark                                               Mode  Cnt   Score   Error  Units
>> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  39.224 ± 0.789  ns/op
>> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  41.011 ± 0.058  ns/op
>>
>>
>> ## Linux AArch64
>>
>> **Current:**
>>
>> Benchmark                                               Mode  Cnt   Score   Error  Units
>> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  45.396 ± 0.185  ns/op
>> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  56.116 ± 0.463  ns/op
>>
>>
>> **Legacy:**
>>
>> Benchmark                                               Mode  Cnt   Score   Error ...
>
> Paul Hübner has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Reviewer feedback.

LGTM. It would be good to publish the benchmark results on the mailing list and then link to that mail here in the PR (for discoverability).

-------------

Marked as reviewed by pminborg (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/30719#pullrequestreview-4168621047
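
For readers unfamiliar with the linker option discussed in the quoted description, here is a minimal sketch of the two kinds of `strtol` downcall being compared: one plain, one linked with `Linker.Option.captureCallState("errno")`. It is not the benchmark from the PR. The class and method names are hypothetical, and it assumes JDK 22 or later (final FFM API) and a platform where C `long` is 64 bits (Linux or macOS, not Windows).

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.VarHandle;

// Hypothetical illustration class; not part of the PR.
final class CaptureCallStateSketch {
    private static final Linker LINKER = Linker.nativeLinker();

    // long strtol(const char *str, char **endptr, int base);
    // Assumption: C long is 64 bits (LP64), so JAVA_LONG matches it.
    private static final FunctionDescriptor STRTOL = FunctionDescriptor.of(
            ValueLayout.JAVA_LONG, ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.JAVA_INT);

    private static final MemorySegment STRTOL_ADDR =
            LINKER.defaultLookup().find("strtol").orElseThrow();

    // Plain downcall: no call state is captured.
    private static final MethodHandle PLAIN = LINKER.downcallHandle(STRTOL_ADDR, STRTOL);

    // Capturing downcall: the handle takes an extra leading MemorySegment
    // that holds the captured state.
    private static final MethodHandle CAPTURING = LINKER.downcallHandle(
            STRTOL_ADDR, STRTOL, Linker.Option.captureCallState("errno"));

    // Accessor for the errno field inside the capture-state struct.
    private static final VarHandle ERRNO = Linker.Option.captureStateLayout()
            .varHandle(MemoryLayout.PathElement.groupElement("errno"));

    /** Parses text in base 10 via a plain downcall. */
    static long parsePlain(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text); // NUL-terminated C string
            return (long) PLAIN.invokeExact(str, MemorySegment.NULL, 10);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    /** Parses text in base 10 and returns the errno value captured after the call. */
    static int parseAndCaptureErrno(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text);
            // Zero-initialized segment that receives the captured state.
            MemorySegment state = arena.allocate(Linker.Option.captureStateLayout());
            long ignored = (long) CAPTURING.invokeExact(state, str, MemorySegment.NULL, 10);
            return (int) ERRNO.get(state, 0L);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println("strtol(\"42\") = " + parsePlain("42"));
        // An out-of-range input makes strtol set errno (ERANGE).
        System.out.println("captured errno = " + parseAndCaptureErrno("99999999999999999999999"));
    }
}
```

The only difference between the two call sites is the extra capture-state segment passed to the capturing handle; that additional work around the native call is the overhead the benchmark in the PR measures.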
