bearophile wrote:
I found an interesting short article about optimization, so I tried its
code in C and D and got strange results (the D code shows timings
opposite to those in the article).
This is the article; look at the "Branch Prediction" section:
http://www.ddj.com/184405848
The C code:
http://codepad.org/QSGIije4
And its asm (MinGW 4.2.1):
http://codepad.org/c7ZRiXGI
The similar D code:
http://codepad.org/slhcSJEA
Its asm (DMD 1.036):
http://codepad.org/AjlraEs9
There is also about a 2x performance difference.
Bye,
bearophile
Are you running it on a Pentium 4? The Pentium 4 has a *horrific* branch
misprediction penalty (minimum 24 cycles, 45 uops). No other processor is
nearly as bad, e.g. it's 15 cycles on Core 2, and it was just 4 cycles on
the Pentium MMX (PMMX).
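
For anyone who wants to see the misprediction cost on their own CPU, here is a minimal sketch in C. It is not the code from the pastes linked above; the function names (sum_above, fill_random, sort_ints, time_passes) are made up for illustration. The idea is to run the same branchy loop over random data and then over the same data sorted: the result is identical, but the branch is much easier to predict in the sorted case. An optimizing compiler may turn the `if` into a conditional move or vectorize the loop, in which case the difference can disappear, so treat any timings as indicative only.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Sum the elements that are >= threshold.
   The `if` inside the loop is the branch being measured. */
long long sum_above(const int *data, size_t n, int threshold)
{
    long long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= threshold)   /* hard to predict on random data */
            sum += data[i];
    }
    return sum;
}

/* Fill the array with pseudo-random values in [0, 256). */
void fill_random(int *data, size_t n, unsigned seed)
{
    srand(seed);
    for (size_t i = 0; i < n; i++)
        data[i] = rand() % 256;
}

/* Comparison callback for qsort: ascending order. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort the array so the branch in sum_above becomes predictable. */
void sort_ints(int *data, size_t n)
{
    qsort(data, n, sizeof *data, cmp_int);
}

/* Run `reps` passes of sum_above and return the CPU time in seconds.
   The accumulated sum is stored in *total_out so the work is not
   discarded as dead code. */
double time_passes(const int *data, size_t n, int threshold,
                   int reps, long long *total_out)
{
    assert(reps > 0);
    long long total = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        total += sum_above(data, n, threshold);
    *total_out = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

To use it, fill an array with fill_random, call time_passes with a threshold of 128, then call sort_ints on the same array and call time_passes again. The two totals should match; the gap between the two times, if any, gives a rough idea of how expensive mispredicted branches are on that processor.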