If I read the data correctly, for the count=100 case on JDK 20
accessVolatileInLoop takes 109 ns/op for the array and 74 ns/op for the field.
To me this looks like the field access is _less_ expensive.
Am I missing something?
On 2023-08-16 13:37, Сергей Цыпанов wrote:
Hello,
I was measuring the cost of hoisting a volatile access out of a loop and found
that the numbers differ between arrays and "plain" references.
Here's the benchmark for array:
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(time = 2, iterations = 5)
@Measurement(time = 2, iterations = 5)
@Fork(value = 4, jvmArgs = "-Xmx1g")
public class VolatileArrayInLoopBenchmark {

    // Reads the volatile array reference on every iteration.
    @Benchmark
    public int accessVolatileInLoop(Data data) {
        int sum = 0;
        for (int i = 0; i < data.count; i++) {
            sum += data.ints[i];
        }
        return sum;
    }

    // Reads the volatile array reference once, before the loop.
    @Benchmark
    public int hoistVolatileFromLoop(Data data) {
        int sum = 0;
        int[] ints = data.ints;
        for (int i = 0; i < data.count; i++) {
            sum += ints[i];
        }
        return sum;
    }

    @State(Scope.Benchmark)
    public static class Data {
        @Param({"1", "10", "100"})
        private int count;

        private volatile int[] ints;

        @Setup
        public void setUp() {
            int[] ints = new int[count];
            for (int i = 0; i < ints.length; i++) {
                ints[i] = ThreadLocalRandom.current().nextInt();
            }
            this.ints = ints;
        }
    }
}
and this one is for reference:
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(time = 2, iterations = 5)
@Measurement(time = 2, iterations = 5)
@Fork(value = 4, jvmArgs = "-Xmx1g")
public class VolatileFieldInLoopBenchmark {

    // Reads the volatile field on every iteration.
    @Benchmark
    public int accessVolatileInLoop(Data data) {
        int sum = 0;
        for (int i = 0; i < data.count; i++) {
            sum += data.value;
        }
        return sum;
    }

    // Reads the volatile field once, before the loop.
    @Benchmark
    public int hoistVolatileFromLoop(Data data) {
        int sum = 0;
        int value = data.value;
        for (int i = 0; i < data.count; i++) {
            sum += value;
        }
        return sum;
    }

    @State(Scope.Benchmark)
    public static class Data {
        private final ThreadLocalRandom random = ThreadLocalRandom.current();
        private volatile int value = random.nextInt();

        @Param({"1", "10", "100"})
        private int count;
    }
}
From the measurement results it looks like volatile array access is cheaper than
"plain" reference access:
Java 19
Benchmark                                           (count)  Mode  Cnt    Score     Error  Units
VolatileArrayInLoopBenchmark.accessVolatileInLoop         1  avgt   20    2.110 ±  0.404  ns/op
VolatileArrayInLoopBenchmark.accessVolatileInLoop        10  avgt   20   14.836 ±  2.825  ns/op
VolatileArrayInLoopBenchmark.accessVolatileInLoop       100  avgt   20  146.497 ± 25.786  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop        1  avgt   20    3.006 ±  0.686  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop       10  avgt   20    6.222 ±  1.215  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop      100  avgt   20   33.262 ±  6.579  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop         1  avgt   20    1.823 ±  0.382  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop        10  avgt   20   10.259 ±  2.874  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop       100  avgt   20   98.648 ± 18.500  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop        1  avgt   20    2.189 ±  0.412  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop       10  avgt   20    4.734 ±  0.891  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop      100  avgt   20    7.126 ±  1.309  ns/op
Java 20
Benchmark                                           (count)  Mode  Cnt    Score     Error  Units
VolatileArrayInLoopBenchmark.accessVolatileInLoop         1  avgt   20    1.714 ±  0.066  ns/op
VolatileArrayInLoopBenchmark.accessVolatileInLoop        10  avgt   20   10.703 ±  0.148  ns/op
VolatileArrayInLoopBenchmark.accessVolatileInLoop       100  avgt   20  109.001 ±  1.866  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop        1  avgt   20    2.408 ±  0.224  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop       10  avgt   20    4.678 ±  0.060  ns/op
VolatileArrayInLoopBenchmark.hoistVolatileFromLoop      100  avgt   20   24.711 ±  1.091  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop         1  avgt   20    1.366 ±  0.105  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop        10  avgt   20    7.388 ±  0.119  ns/op
VolatileFieldInLoopBenchmark.accessVolatileInLoop       100  avgt   20   74.630 ±  1.163  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop        1  avgt   20    1.653 ±  0.035  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop       10  avgt   20    3.138 ±  0.040  ns/op
VolatileFieldInLoopBenchmark.hoistVolatileFromLoop      100  avgt   20    4.945 ±  0.177  ns/op
So my question is: why is volatile reference access relatively more expensive
than volatile array access?
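To make "relatively" concrete, one possible reading is to compare how much each benchmark gains from hoisting, rather than the absolute scores. The sketch below does that for the Java 20, count=100 rows of the table. The class name HoistSpeedup and the method speedup are hypothetical helpers introduced only for this illustration, and the ratio is just one way to interpret the question, not necessarily the one the author intended.

```java
import java.util.Locale;

public class HoistSpeedup {

    // Ratio of the in-loop volatile read time to the hoisted time (both in ns/op).
    static double speedup(double inLoopNs, double hoistedNs) {
        return inLoopNs / hoistedNs;
    }

    public static void main(String[] args) {
        // Scores copied from the Java 20 table, count = 100.
        double array = speedup(109.001, 24.711); // accessVolatileInLoop / hoistVolatileFromLoop, array
        double field = speedup(74.630, 4.945);   // accessVolatileInLoop / hoistVolatileFromLoop, field

        System.out.println(String.format(Locale.ROOT,
                "array: %.1fx, field: %.1fx", array, field));
    }
}
```

Under this reading, hoisting gains roughly 4.4x for the array and roughly 15.1x for the field, even though the in-loop field access is faster in absolute terms (74 ns/op versus 109 ns/op).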