On Thu, 23 Apr 2026 16:24:53 GMT, Paul Hübner <[email protected]> wrote:

>> Hi all,
>> 
>> The Java FF&M API includes functionality to both initialize and read from 
>> thread-local data prior to and immediately after downcalls, respectively, 
>> through the `Linker.Option::captureCallState` API. This is useful, as an 
>> example, when setting or capturing `errno` when interfacing with C 
>> functions. However, using this linker option introduces some invocation 
>> overhead at runtime.
>> 
>> This RFE introduces a JMH microbenchmark which quantifies this overhead. A 
>> simple downcall to `strtol` is measured with and without call state 
>> capturing. 
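For readers unfamiliar with the option, here is a minimal sketch of the two call shapes being compared: a plain `strtol` downcall and one linked with `Linker.Option.captureCallState("errno")`. It is not the benchmark from the PR. The class and method names are hypothetical, it assumes JDK 22 or later, and it assumes a platform where C `long` is 64 bits (for example Linux or macOS).

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.VarHandle;

// Illustrative only: names here are hypothetical and not taken from the PR.
public class StrtolCaptureSketch {
    private static final Linker LINKER = Linker.nativeLinker();
    private static final MemorySegment STRTOL =
            LINKER.defaultLookup().find("strtol").orElseThrow();

    // long strtol(const char *str, char **endptr, int base);
    // Assumption: C long is 64 bits (LP64), so JAVA_LONG matches it.
    private static final FunctionDescriptor DESC = FunctionDescriptor.of(
            ValueLayout.JAVA_LONG,
            ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.JAVA_INT);

    // Downcall handle without call state capturing.
    private static final MethodHandle PLAIN = LINKER.downcallHandle(STRTOL, DESC);

    // Downcall handle that captures errno; it takes an extra leading
    // MemorySegment parameter for the capture state.
    private static final MethodHandle CAPTURING =
            LINKER.downcallHandle(STRTOL, DESC, Linker.Option.captureCallState("errno"));

    private static final StructLayout CAPTURE_LAYOUT = Linker.Option.captureStateLayout();
    private static final VarHandle ERRNO =
            CAPTURE_LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("errno"));

    // Parses a base-10 number with the plain handle.
    static long parsePlain(String text) throws Throwable {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text);
            return (long) PLAIN.invokeExact(str, MemorySegment.NULL, 10);
        }
    }

    // Parses a base-10 number and returns {value, errno as captured after the call}.
    static long[] parseWithCapture(String text) throws Throwable {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text);
            MemorySegment state = arena.allocate(CAPTURE_LAYOUT); // zero-initialized
            long value = (long) CAPTURING.invokeExact(state, str, MemorySegment.NULL, 10);
            int errno = (int) ERRNO.get(state, 0L);
            return new long[] {value, errno};
        }
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(parsePlain("42"));
        long[] overflow = parseWithCapture("99999999999999999999999");
        System.out.println(overflow[0] + " errno=" + overflow[1]);
    }
}
```

The only difference at the call site is the extra capture-state segment passed as the first argument; the benchmark described above measures the cost of that additional work per invocation.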
>> 
>> Testing: GHA for sanity testing.
>> 
>> # Benchmarking Results
>> 
>> ⚠️ **These results apply prior to 
>> [28506ca](https://github.com/openjdk/jdk/pull/30719/commits/28506ca7a8f03cddb0903e4f58b4f38742f15606)!**
>> 
>> I have executed this benchmark on Oracle-supported platforms; the results 
>> can be found below. For each platform, I've done two trials corresponding to 
>> different JDKs: one being the current HEAD ("current", based on 
>> 8357de88aa35ee998fefe321ee6dae9eb4993fa6), and one with 
>> [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559) reverted 
>> ("legacy", revert commit on top of 
>> 8357de88aa35ee998fefe321ee6dae9eb4993fa6). This one-off experiment is 
>> insightful since [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559) 
>> increased the overhead of downcalls using state capturing by introducing 
>> thread-local data initialization. The performance impact of this change was 
>> previously unquantified.
>> 
>> ## Linux x64
>> 
>> **Current:**
>> 
>> Benchmark                                               Mode  Cnt   Score   Error  Units
>> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  38.442 ± 0.016  ns/op
>> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  45.425 ± 1.826  ns/op
>> 
>> 
>> **Legacy:**
>> 
>> Benchmark                                               Mode  Cnt   Score   Error  Units
>> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  39.224 ± 0.789  ns/op
>> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  41.011 ± 0.058  ns/op
>> 
>> 
>> ## Linux AArch64
>> 
>> **Current:**
>> 
>> Benchmark                                               Mode  Cnt   Score   Error  Units
>> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  45.396 ± 0.185  ns/op
>> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  56.116 ± 0.463  ns/op
>> 
>> 
>> **Legacy:**
>> 
>> Benchmark                                               Mode  Cnt   Score   
>> Error  ...
>
> Paul Hübner has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Reviewer feedback.

LGTM. 

It would be good to publish the benchmark results on the mailing list and 
then link to that mail here in the PR (for discoverability).

-------------

Marked as reviewed by pminborg (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/30719#pullrequestreview-4168621047
