Open JDKers, I am forwarding an email to get some clarification. It's been
a common understanding that foreach should perform no differently than the
equivalent for-loop . However, some fellow developers claim there is a
noticable difference in their microbenchmarking. Can you help explain what
is really going on? It's either the case there is a true difference (a
result that would surprise me) or the results are within a margin of error
that make the results insignificant. Please advise.

Cheers,
Paul

---------- Forwarded message ----------
From: Remko Popma <remko.po...@gmail.com>
Date: Fri, Sep 26, 2014 at 8:43 AM
Subject: Re: No for each loop comment?
To: Log4J Developers List <log4j-...@logging.apache.org>


On Windows it looks like normal for loops are slightly faster than for-each
loops, especially for small arrays of primitives. This could be noise,
since we are talking about 5 nanoseconds where the baseline (an empty
method invocation) is 12 nanos.

On Solaris 10 and Red Hat Enterprise Linux the baseline is so large (255
nanos and 1910 nanos respectively) that any difference we are seeing is
just noise.

All benchmarks were run with one fork, one thread, 10 warmup iterations and
10 test iterations.

*Windows 7 (64bit) with Java 1.8.0_05, 2-core Intel i5-3317u CPU @1.70Ghz
with hyperthreading switched on (4 virtual cores)*
Benchmark                                                Mode  Samples
 Score  Score error  Units
o.a.l.l.p.j.LoopsBenchmark.baseline                    sample   154947
12.432        0.550  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10000ForEachLoop    sample   126597
2759.592        3.431  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10000ForLoop        sample   126494
2761.729        3.127  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray1000ForEachLoop     sample   154124
 292.880        1.065  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray1000ForLoop         sample   156155
 288.751        1.101  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray100ForEachLoop      sample   191980
41.826        0.870  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray100ForLoop          sample   193770
36.894        0.782  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10ForEachLoop       sample   190847
22.393        0.618  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10ForLoop           sample   192552
17.146        0.560  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10000ForEachLoop    sample   173839
 31959.057       14.341  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10000ForLoop        sample   171137
 32461.985       14.353  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray1000ForEachLoop     sample    97495
3591.200        4.852  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray1000ForLoop         sample   101560
3445.998        4.010  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray100ForEachLoop      sample   102796
 438.207        1.923  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray100ForLoop          sample   102333
 439.576        2.139  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10ForEachLoop       sample   113924
58.957        1.247  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10ForLoop           sample   120416
60.712        1.284  ns/op


// For loops for Object arrays are similar but return the total XOR of the
element hashcodes.

private int forEachLoop(final int[] array) {
    int result = 0;
    for (final int element : array) {
        result ^= element;
    }
    return result;
}

private int forLoop(final int[] array) {
    int result = 0;
    for (int i = 0; i < array.length; i++) {
        result ^= array[i];
    }
    return result;
}



*Solaris 10 (64bit) with JDK1.7.0_06-b24 (Oracle Hotspot), 2 quad-core Xeon
X5570 dual CPUs @2.93Ghz with hyperthreading switched on (16 virtual cores)*
Benchmark                                                Mode  Samples
 Score  Score error  Units
o.a.l.l.p.j.LoopsBenchmark.baseline                    sample   110212
 255.300        0.201  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10000ForEachLoop    sample   187808
3938.055        1.207  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10000ForLoop        sample   187897
3937.929        0.748  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray1000ForEachLoop     sample   123989
 606.631        0.626  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray1000ForLoop         sample   123973
 609.565        0.416  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray100ForEachLoop      sample   126933
 294.204        0.280  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray100ForLoop          sample   127070
 296.411        0.223  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10ForEachLoop       sample   113400
 261.519        0.181  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10ForLoop           sample   111637
 260.435        0.115  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10000ForEachLoop    sample   115872
 48154.673       18.846  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10000ForLoop        sample   116777
 47793.868       17.615  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray1000ForEachLoop     sample   138432
5256.767        2.451  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray1000ForLoop         sample   136644
5325.377        2.388  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray100ForEachLoop      sample   166653
 773.541        0.330  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray100ForLoop          sample   166570
 774.513        0.574  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10ForEachLoop       sample   178754
 317.232        0.134  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10ForLoop           sample   180165
 316.189        0.238  ns/op

*64 bit RHEL 6.5 (Linux 2.6.32-431.el6.x86_64) with JDK1.7.0_05-b06 (Oracle
Hotspot), 4 quad-core Xeon X5570 CPUs @2.93GHz with hyperthreading switched
on (16 virtual cores)*
Benchmark                                                Mode  Samples
 Score  Score error  Units
o.a.l.l.p.j.LoopsBenchmark.baseline                    sample   114783
1910.576       29.256  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10000ForEachLoop    sample   194584
5132.885       25.137  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10000ForLoop        sample   196672
4811.572       52.072  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray1000ForEachLoop     sample   133119
1967.213       28.970  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray1000ForLoop         sample   133804
2004.501       31.554  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray100ForEachLoop      sample   142439
1575.329        6.457  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray100ForLoop          sample   142215
1957.714       27.815  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10ForEachLoop       sample   130826
1980.301       30.818  ns/op
o.a.l.l.p.j.LoopsBenchmark.intArray10ForLoop           sample   132654
1589.120        8.449  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10000ForEachLoop    sample   126947
 43301.320       50.589  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10000ForLoop        sample   126117
 43574.129       55.272  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray1000ForEachLoop     sample   143697
5831.250       19.667  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray1000ForLoop         sample   163244
4823.096       13.180  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray100ForEachLoop      sample   162502
1930.819       24.136  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray100ForLoop          sample   171619
1625.806       10.385  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10ForEachLoop       sample   172226
1888.683       22.554  ns/op
o.a.l.l.p.j.LoopsBenchmark.objArray10ForLoop           sample   188838
1581.979        6.322  ns/op




On Fri, Sep 26, 2014 at 2:56 AM, Matt Sicker <boa...@gmail.com> wrote:

> The foreach over an array looks like it's supposed to compile to the same
> thing:
>
> https://jcp.org/aboutJava/communityprocess/jsr/tiger/enhanced-for.html
>
> Same goes for .length which is supposed to be a final field which would
> allow for inlining by the JIT I'd imagine (hence why we use final
> everywhere):
>
> http://docs.oracle.com/javase/specs/jls/se7/html/jls-10.html#jls-10.7
>
>
>>>> On 24 September 2014 22:12, Gary Gregory <garydgreg...@gmail.com>
>>>> wrote:
>>>>
>>>>> Why does this "//noinspection ForLoopReplaceableByForEach" comment
>>>>> mean?
>>>>>
>>>>> Why not for an enhanced for each loop?
>>>>>
>>>>>  private static boolean contains(final Marker parent, final Marker...
>>>>> localParents) {
>>>>>             //noinspection ForLoopReplaceableByForEach
>>>>>             for (int i = 0, localParentsLength = localParents.length;
>>>>> i < localParentsLength; i++) {
>>>>>                 final Marker marker = localParents[i];
>>>>>                 if (marker == parent) {
>>>>>                     return true;
>>>>>                 }
>>>>>             }
>>>>>             return false;
>>>>>         }
>>>>>
>>>>> Thanks,
>>>>> Gary
>>>>>
>>>>>
>>>>>

Reply via email to