> Hi all,
>
> The Java FF&M API can initialize thread-local data immediately before a
> downcall and read it back immediately after the call, through the
> `Linker.Option::captureCallState` API. This is useful, for example, for
> setting or capturing `errno` when interfacing with C functions. However,
> using this linker option adds invocation overhead at runtime.
>
> This RFE introduces a JMH microbenchmark which quantifies this overhead. A
> simple downcall to `strtol` is measured with and without call state
> capturing.
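
For readers unfamiliar with the option, here is a minimal sketch of a `strtol` downcall with and without call state capturing. It is not the benchmark from the PR: the class name `CaptureCallStateSketch` and both helper methods are hypothetical, and it assumes JDK 22 or later, where the FFM API is final. The C `long` layout is looked up through the linker because its size differs between platforms.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.VarHandle;

// Hypothetical illustration class; not part of the PR.
public class CaptureCallStateSketch {
    private static final Linker LINKER = Linker.nativeLinker();

    // C `long` is 32-bit on Windows and 64-bit on LP64 platforms,
    // so ask the linker for the platform's layout.
    private static final MemoryLayout C_LONG = LINKER.canonicalLayouts().get("long");

    // long strtol(const char *str, char **endptr, int base)
    private static final FunctionDescriptor STRTOL_DESC = FunctionDescriptor.of(
            C_LONG, ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.JAVA_INT);

    private static final MemorySegment STRTOL_ADDR =
            LINKER.defaultLookup().find("strtol").orElseThrow();

    // Downcall handle without call state capturing.
    private static final MethodHandle PLAIN =
            LINKER.downcallHandle(STRTOL_ADDR, STRTOL_DESC);

    // Same function, but the handle takes a leading capture-state segment.
    private static final MethodHandle CAPTURING = LINKER.downcallHandle(
            STRTOL_ADDR, STRTOL_DESC, Linker.Option.captureCallState("errno"));

    // Accessor for the `errno` field inside the capture-state struct.
    private static final VarHandle ERRNO = Linker.Option.captureStateLayout()
            .varHandle(PathElement.groupElement("errno"));

    /** Parses a base-10 number with strtol, without capturing call state. */
    public static long parsePlain(String s) {
        try (Arena arena = Arena.ofConfined()) {
            Object result = PLAIN.invoke(arena.allocateFrom(s), MemorySegment.NULL, 10);
            return ((Number) result).longValue();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    /** Calls strtol with errno capturing and returns the captured errno. */
    public static int errnoAfterParse(String s) {
        try (Arena arena = Arena.ofConfined()) {
            // Freshly allocated arena memory is zero-initialized.
            MemorySegment state = arena.allocate(Linker.Option.captureStateLayout());
            CAPTURING.invoke(state, arena.allocateFrom(s), MemorySegment.NULL, 10);
            return (int) ERRNO.get(state, 0L);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println("strtol(\"42\") = " + parsePlain("42"));
        // An out-of-range input makes strtol set errno (ERANGE).
        System.out.println("errno after overflow = "
                + errnoAfterParse("99999999999999999999999999"));
    }
}
```

The only difference at the call site is the extra leading `MemorySegment` argument; the cost of filling and reading that segment around the native call is the overhead the benchmark measures.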
>
> Testing: GHA for sanity testing.
>
> # Benchmarking Results
>
> I have executed this benchmark on Oracle-supported platforms; the results can
> be found below. For each platform, I've done two trials corresponding to
> different JDKs: one being the current HEAD ("current", based on
> 8357de88aa35ee998fefe321ee6dae9eb4993fa6), and one with
> [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559) reverted
> ("legacy", revert commit on top of 8357de88aa35ee998fefe321ee6dae9eb4993fa6).
> This one-off experiment is informative because
> [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559) increased the
> overhead of downcalls using state capturing by introducing thread-local data
> initialization. The performance impact of this change was previously
> unquantified.
>
> ## Linux x64
>
> **Current:**
>
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  38.442 ± 0.016  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  45.425 ± 1.826  ns/op
>
>
> **Legacy:**
>
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  39.224 ± 0.789  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  41.011 ± 0.058  ns/op
>
>
> ## Linux AArch64
>
> **Current:**
>
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  45.396 ± 0.185  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  56.116 ± 0.463  ns/op
>
>
> **Legacy:**
>
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  44.859 ± 0.183  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  51.721 ± 0.153  ns/op
>
>
> ## macOS x64
>
> **Current:**
>
> Benchmark ...
Paul Hübner has updated the pull request with a new target base due to a merge
or a rebase. The incremental webrev excludes the unrelated changes brought in
by the merge/rebase. The pull request contains five additional commits since
the last revision:
- Remove extraneous export.
- Merge branch 'master' into JDK-8379630
- Different long sizes on different platforms.
- Use C standard library for the benchmark instead.
- Benchmark to measure call state capturing overhead.
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/30719/files
- new: https://git.openjdk.org/jdk/pull/30719/files/128abe1d..9eda9bbd
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=30719&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=30719&range=00-01
Stats: 31231 lines in 490 files changed: 12691 ins; 15665 del; 2875 mod
Patch: https://git.openjdk.org/jdk/pull/30719.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/30719/head:pull/30719
PR: https://git.openjdk.org/jdk/pull/30719