On 12/17/2015 02:34 AM, Ulf wrote:
> I'm wondering why moving the increment operation to an extra line would
> enhance performance.

Because C1 is very straightforward, and code movement like that is a poor
man's instruction scheduling that pads out the data dependency between the
index update and the indexed access. I don't think it deserves a comment --
it is expected one will run the benchmarks when changing that code.

Thanks,
-Aleksey
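
For illustration only, here is a minimal sketch of the kind of code movement being discussed. It is not the code under review: the class name `IncrementPlacement`, the method names, and the byte-to-char copy loop are all hypothetical, chosen just to show an index increment fused into the indexed access next to the same increment moved to its own line. Both forms compute the same result; whether either one is faster under C1 is exactly what the benchmarks have to show.

```java
// Hypothetical example; names and loop body are assumptions, not the reviewed code.
public class IncrementPlacement {

    // Fused form: each index is updated inside the same expression
    // that uses it for the array access.
    static int copyFused(byte[] src, char[] dst, int len) {
        int sp = 0;
        int dp = 0;
        while (sp < len) {
            dst[dp++] = (char) (src[sp++] & 0xff);
        }
        return dp;
    }

    // Split form: the array access uses the current index values, and the
    // increments are moved to separate lines after the access.
    static int copySplit(byte[] src, char[] dst, int len) {
        int sp = 0;
        int dp = 0;
        while (sp < len) {
            dst[dp] = (char) (src[sp] & 0xff);
            sp++;
            dp++;
        }
        return dp;
    }
}
```

The two methods are interchangeable from the caller's point of view, so a change like this can only be judged by measuring it, for example with a JMH benchmark run with the compiler limited to C1.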