On Tue, 14 Apr 2026 11:21:27 GMT, Paul Hübner <[email protected]> wrote:
> Hi all,
>
> The Java FF&M API includes functionality to both initialize and read from
> thread-local data prior to and immediately after downcalls, respectively,
> through the `Linker.Option::captureCallState` API. This is useful, for
> example, when setting or capturing `errno` when interfacing with C functions.
> However, using this linker option introduces some invocation overhead at
> runtime.
>
> This RFE introduces a JMH microbenchmark which quantifies this overhead. A
> simple downcall to `strtol` is measured with and without call state
> capturing.
>
> Testing: GHA for sanity testing.
>
> # Benchmarking Results
>
> I have executed this benchmark on Oracle-supported platforms; the results can
> be found below. For each platform, I've done two trials corresponding to
> different JDKs: one being the current HEAD ("current", based on
> 8357de88aa35ee998fefe321ee6dae9eb4993fa6), and one with
> [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559) reverted
> ("legacy", revert commit on top of 8357de88aa35ee998fefe321ee6dae9eb4993fa6).
> This one-off experiment is insightful since
> [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559) increased the
> overhead of downcalls using state capturing by introducing thread-local data
> initialization. The performance impact of this change was previously
> unquantified.
>
> ## Linux x64
>
> **Current:**
>
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  38.442 ± 0.016  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  45.425 ± 1.826  ns/op
>
>
> **Legacy:**
>
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  39.224 ± 0.789  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  41.011 ± 0.058  ns/op
>
>
> ## Linux AArch64
>
> **Current:**
>
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  45.396 ± 0.185  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  56.116 ± 0.463  ns/op
>
>
> **Legacy:**
>
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  44.859 ± 0.183  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  51.721 ± 0.153  ns/op
>
>
> ## macOS x64
>
> **Current:**
>
> Benchmark ...
test/micro/org/openjdk/bench/java/lang/foreign/CaptureCallStateOverheadBench.java
line 63:
> 61: @State(Scope.Benchmark)
> 62: @OutputTimeUnit(TimeUnit.NANOSECONDS)
> 63: @Fork(value = 3, jvmArgs = {"--add-exports=java.base/jdk.internal.foreign=ALL-UNNAMED",
The `--add-exports` option does not seem necessary; the benchmark does not use
anything from this package.
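
For illustration, here is a minimal sketch (not the benchmark code from this PR)
of a `strtol` downcall with and without `errno` capture that uses only the public
`java.lang.foreign` API, so it needs no `--add-exports`. The class and method
names are hypothetical. It assumes JDK 22 or later and an LP64 platform where C
`long` is 64 bits.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.VarHandle;

// Hypothetical class name; this is not the benchmark from the PR.
public class CaptureCallStateSketch {
    private static final Linker LINKER = Linker.nativeLinker();

    // long strtol(const char *str, char **endptr, int base);
    // Assumption: C `long` is 64 bits (LP64, e.g. Linux or macOS).
    private static final FunctionDescriptor STRTOL_DESC = FunctionDescriptor.of(
            ValueLayout.JAVA_LONG, ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.JAVA_INT);

    private static final MemorySegment STRTOL =
            LINKER.defaultLookup().find("strtol").orElseThrow();

    // Downcall handle without call state capturing.
    private static final MethodHandle PLAIN = LINKER.downcallHandle(STRTOL, STRTOL_DESC);

    // Downcall handle that captures errno; it takes an extra leading
    // MemorySegment parameter that holds the captured state.
    private static final MethodHandle CAPTURING = LINKER.downcallHandle(
            STRTOL, STRTOL_DESC, Linker.Option.captureCallState("errno"));

    private static final StructLayout CAPTURE_LAYOUT = Linker.Option.captureStateLayout();
    private static final VarHandle ERRNO =
            CAPTURE_LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("errno"));

    static long parsePlain(String text) throws Throwable {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text); // NUL-terminated C string
            return (long) PLAIN.invokeExact(str, MemorySegment.NULL, 10);
        }
    }

    // Returns {parsed value, captured errno}.
    static long[] parseCapturing(String text) throws Throwable {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text);
            MemorySegment state = arena.allocate(CAPTURE_LAYOUT);
            long value = (long) CAPTURING.invokeExact(state, str, MemorySegment.NULL, 10);
            int errno = (int) ERRNO.get(state, 0L); // 0L is the base offset into `state`
            return new long[] {value, errno};
        }
    }

    public static void main(String[] args) throws Throwable {
        System.out.println("plain:     " + parsePlain("42"));
        long[] result = parseCapturing("42");
        System.out.println("capturing: " + result[0] + ", errno=" + result[1]);
    }
}
```

Everything above resolves against exported packages, which is why the extra
`jvmArgs` entry looks redundant unless something else in the benchmark needs it.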
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/30719#discussion_r3080647782