This is a microbenchmark to demonstrate that `Thread.onSpinWait` can be used to 
avoid heavy locks.
The microbenchmark differs from [Gil's original 
benchmark](https://github.com/giltene/GilExamples/tree/master/SpinWaitTest) and 
[Dmitry's 
variations](http://cr.openjdk.java.net/~dchuyko/8186670/yield/spinwait.html). 
Those benchmarks produce/consume data by incrementing a volatile counter. The 
latency of such operations is almost zero. They also don't use heavy locks. 
According to [Gil's 
SpinWaitTest.java](https://github.com/giltene/GilExamples/blob/master/SpinWaitTest/src/main/java/SpinWaitTest.java):
> This test can be used to measure and document the impact of
> Runtime.onSpinWait() behavior on thread-to-thread communication latencies.
> E.g. when the two threads are pinned to the two hardware threads of a shared
> x86 core (with a shared L1), this test will demonstrate an estimate the best
> case thread-to-thread latencies possible on the platform

Gil's microbenchmark targets SMT cases (x86 hyperthreading). Because not all 
CPUs support SMT, those microbenchmarks cannot demonstrate the benefits of 
`Thread.onSpinWait` on CPUs without it. There they show the opposite: 
`Thread.onSpinWait` has a negative impact on performance.
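
For reference, the volatile-counter style used by those earlier benchmarks can be sketched roughly as below. This is not Gil's or Dmitry's code: the class name `PingPongSketch` and the method `run` are hypothetical, and the sketch only illustrates two threads handing off by incrementing a shared volatile counter, with no locks involved.

```java
// Hypothetical sketch of a volatile-counter ping-pong; not taken from the
// referenced benchmarks.
final class PingPongSketch {
    // Shared counter: the main thread makes it odd, the helper makes it even.
    private static volatile long counter;

    // Performs `iterations` round trips and returns the final counter value.
    static long run(long iterations) {
        counter = 0;
        Thread helper = new Thread(() -> {
            for (long i = 0; i < iterations; i++) {
                // Spin until the main thread has made the counter odd.
                while ((counter & 1) == 0) {
                    Thread.onSpinWait();
                }
                // Only this thread writes while the counter is odd.
                counter++;
            }
        });
        helper.start();
        for (long i = 0; i < iterations; i++) {
            // Spin until the helper has made the counter even again.
            while ((counter & 1) == 1) {
                Thread.onSpinWait();
            }
            // Only this thread writes while the counter is even.
            counter++;
        }
        try {
            helper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }
}
```

Each hand-off here is a single volatile increment, so the work per item is close to zero and the threads never block.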

The microbenchmark from this PR uses `BigInteger` to get latencies of 100 - 200 
ns for producing/consuming data. These latencies can cause either the producer 
or the consumer to wait for the other. Waiting is implemented with 
`Object.wait`/`Object.notify`, which are heavy. `Thread.onSpinWait` can be used 
in a spin loop to avoid them.
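
The idea can be sketched as a single-slot hand-off that spins a bounded number of times before falling back to `Object.wait`/`Object.notify`. This is a minimal illustration and not the code from the PR: the class `SpinThenWaitSketch` and its methods are hypothetical, and `spinNum` is assumed here to mean the maximum number of spin iterations before blocking.

```java
import java.math.BigInteger;

// Hypothetical sketch: one producer, one consumer, one slot.
final class SpinThenWaitSketch {
    private final Object lock = new Object();
    private final int spinNum;          // assumed: max spin iterations before blocking
    private volatile BigInteger slot;   // null means the slot is empty

    SpinThenWaitSketch(int spinNum) {
        this.spinNum = spinNum;
    }

    // Must be called while holding `lock`.
    private void blockOnLock() {
        try {
            lock.wait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    void put(BigInteger value) {
        // Spin briefly, hoping the consumer empties the slot soon.
        for (int i = 0; i < spinNum && slot != null; i++) {
            Thread.onSpinWait();
        }
        synchronized (lock) {
            // Heavy path: block only if the slot is still full.
            while (slot != null) {
                blockOnLock();
            }
            slot = value;
            lock.notify();
        }
    }

    BigInteger take() {
        // Spin briefly, hoping the producer fills the slot soon.
        for (int i = 0; i < spinNum && slot == null; i++) {
            Thread.onSpinWait();
        }
        synchronized (lock) {
            // Heavy path: block only if the slot is still empty.
            while (slot == null) {
                blockOnLock();
            }
            BigInteger value = slot;
            slot = null;
            lock.notify();
            return value;
        }
    }

    // Sends 1..maxNum through the slot and returns the sum seen by the consumer.
    static BigInteger demo(int maxNum, int spinNum) {
        SpinThenWaitSketch box = new SpinThenWaitSketch(spinNum);
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= maxNum; i++) {
                box.put(BigInteger.valueOf(i));
            }
        });
        producer.start();
        BigInteger sum = BigInteger.ZERO;
        for (int i = 0; i < maxNum; i++) {
            sum = sum.add(box.take());
        }
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

With `spinNum` set to 0 the sketch degenerates to plain `wait`/`notify`; with a positive value a thread that finds the slot ready within the spin budget never has to block.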

**ARM64 results**:
- No spin loop

Benchmark                               (maxNum)  (spinNum)  Mode  Cnt     Score    Error  Units
ThreadOnSpinWaitProducerConsumer.trial       100          0  avgt   75  1520.448 ± 40.507  us/op

- No `Thread.onSpinWait` intrinsic

Benchmark                               (maxNum)  (spinNum)  Mode  Cnt     Score    Error  Units
ThreadOnSpinWaitProducerConsumer.trial       100        125  avgt   75  1580.756 ± 47.501  us/op

- `ISB`-based `Thread.onSpinWait` intrinsic

Benchmark                               (maxNum)  (spinNum)  Mode  Cnt    Score     Error  Units
ThreadOnSpinWaitProducerConsumer.trial       100        125  avgt   75  617.454 ± 174.431  us/op


**X86_64 results**:
- No spin loop

Benchmark                               (maxNum)  (spinNum)  Mode  Cnt     Score   Error  Units
ThreadOnSpinWaitProducerConsumer.trial       100        125  avgt   75  1417.944 ± 1.691  us/op

- No `Thread.onSpinWait` intrinsic

Benchmark                               (maxNum)  (spinNum)  Mode  Cnt     Score   Error  Units
ThreadOnSpinWaitProducerConsumer.trial       100        125  avgt   75  1410.987 ± 2.093  us/op

- `PAUSE`-based `Thread.onSpinWait` intrinsic

Benchmark                               (maxNum)  (spinNum)  Mode  Cnt    Score   Error  Units
ThreadOnSpinWaitProducerConsumer.trial       100        125  avgt   75  217.054 ± 1.283  us/op

-------------

Commit messages:
 - 8275728: Add simple Producer/Consumer microbenchmark for Thread.onSpinWait

Changes: https://git.openjdk.java.net/jdk/pull/6338/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6338&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8275728
  Stats: 204 lines in 1 file changed: 204 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6338.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6338/head:pull/6338

PR: https://git.openjdk.java.net/jdk/pull/6338
