> Hi all,
> 
> The Java FF&M API includes functionality to both initialize and read from 
> thread-local data prior to and immediately after downcalls, respectively, 
> through the `Linker.Option::captureCallState` API. This is useful, for 
> example, for setting or capturing `errno` when interfacing with C functions. 
> However, using this linker option introduces some invocation overhead at 
> runtime.
> 
> This RFE introduces a JMH microbenchmark which quantifies this overhead. A 
> simple downcall to `strtol` is measured with and without call state 
> capturing. 
> 
> Testing: GHA for sanity testing.
> 
> # Benchmarking Results
> 
> I have executed this benchmark on Oracle-supported platforms; the results can 
> be found below. For each platform, I've done two trials corresponding to 
> different JDKs: one being the current HEAD ("current", based on 
> 8357de88aa35ee998fefe321ee6dae9eb4993fa6), and one with 
> [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559) reverted 
> ("legacy", revert commit on top of 8357de88aa35ee998fefe321ee6dae9eb4993fa6). 
> This one-off experiment is insightful since 
> [JDK-8378559](https://bugs.openjdk.org/browse/JDK-8378559) increased the 
> overhead of downcalls using state capturing by introducing thread-local data 
> initialization. The performance impact of this change was previously 
> unquantified.
> 
> ## Linux x64
> 
> **Current:**
> 
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  38.442 ± 0.016  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  45.425 ± 1.826  ns/op
> 
> 
> **Legacy:**
> 
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  39.224 ± 0.789  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  41.011 ± 0.058  ns/op
> 
> 
> ## Linux AArch64
> 
> **Current:**
> 
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  45.396 ± 0.185  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  56.116 ± 0.463  ns/op
> 
> 
> **Legacy:**
> 
> Benchmark                                               Mode  Cnt   Score   Error  Units
> CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  44.859 ± 0.183  ns/op
> CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  51.721 ± 0.153  ns/op
> 
> 
> ## macOS x64
> 
> **Current:**
> 
> Benchmark ...

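For readers unfamiliar with the option, the quoted description's `strtol` downcall, linked once without and once with `Linker.Option.captureCallState("errno")`, might look roughly like the sketch below. This is not the benchmark from the PR: the class, record, and method names are hypothetical, and it assumes JDK 22 or later (final FFM API, where layout var handles take an extra base-offset argument) with native access permitted.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.VarHandle;

// Hypothetical illustration; not the benchmark added by the PR.
public class CaptureErrnoSketch {
    public record Result(long value, int errno) {}

    static final Linker LINKER = Linker.nativeLinker();

    // The size of C `long` differs between platforms, so ask the linker for it.
    static final MemoryLayout C_LONG = LINKER.canonicalLayouts().get("long");

    // long strtol(const char *str, char **endptr, int base)
    static final FunctionDescriptor STRTOL_DESC = FunctionDescriptor.of(
            C_LONG, ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.JAVA_INT);

    static final MemorySegment STRTOL =
            LINKER.defaultLookup().find("strtol").orElseThrow();

    // Plain downcall: no call state is captured.
    static final MethodHandle PLAIN = LINKER.downcallHandle(STRTOL, STRTOL_DESC);

    // Capturing downcall: takes an extra leading MemorySegment for the state.
    static final MethodHandle CAPTURING = LINKER.downcallHandle(
            STRTOL, STRTOL_DESC, Linker.Option.captureCallState("errno"));

    static final StructLayout CAPTURE_LAYOUT = Linker.Option.captureStateLayout();
    static final VarHandle ERRNO =
            CAPTURE_LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("errno"));

    public static long parsePlain(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text);
            // invoke (not invokeExact) boxes the int or long result for us.
            Object result = PLAIN.invoke(str, MemorySegment.NULL, 10);
            return ((Number) result).longValue();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static Result parseCapturing(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text);
            // Arena allocations are zero-initialized, so errno starts at 0 here.
            MemorySegment capture = arena.allocate(CAPTURE_LAYOUT);
            Object result = CAPTURING.invoke(capture, str, MemorySegment.NULL, 10);
            int errno = (int) ERRNO.get(capture, 0L);
            return new Result(((Number) result).longValue(), errno);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePlain("42"));
        // An out-of-range input should make strtol report an error via errno.
        System.out.println(parseCapturing("99999999999999999999999999"));
    }
}
```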
Paul Hübner has updated the pull request with a new target base due to a merge 
or a rebase. The incremental webrev excludes the unrelated changes brought in 
by the merge/rebase. The pull request contains five additional commits since 
the last revision:

 - Remove extraneous export.
 - Merge branch 'master' into JDK-8379630
 - Different long sizes on different platforms.
 - Use C standard library for the benchmark instead.
 - Benchmark to measure call state capturing overhead.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/30719/files
  - new: https://git.openjdk.org/jdk/pull/30719/files/128abe1d..9eda9bbd

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=30719&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=30719&range=00-01

  Stats: 31231 lines in 490 files changed: 12691 ins; 15665 del; 2875 mod
  Patch: https://git.openjdk.org/jdk/pull/30719.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/30719/head:pull/30719

PR: https://git.openjdk.org/jdk/pull/30719
