Stanislav Blinov:
> You mean with your current version of ldc?
Yes. The older version of LDC2 doesn't even compile the code, so
I need to use 0.13.0-alpha1.
Your D code with small changes:
http://codepad.org/xqqScd42
Asm generated by G++ for the advance function (that is the one
that uses most of the run time):
http://codepad.org/tApRNsVy
Asm generated by ldc2:
http://codepad.org/jKSJcOAZ
With N = 5_000_000, my timings on an old CPU are 2.23 seconds for
ldc2 and 1.83 seconds for g++, so the ldc2 build is about 22%
slower.
I have tried manually unrolling the loop in the D code, but
performance got worse. I'll try some more later.
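To be clear about what I mean by manual unrolling, here is a
minimal sketch on a made-up loop. The names sumSquares and
sumSquaresUnrolled are hypothetical, and this is not the advance
function from the links above, just the general shape of the
transformation:

```d
import std.stdio;

// Hypothetical stand-in for a hot loop (not the advance function).
double sumSquares(const double[] a) pure nothrow @safe
{
    double s = 0.0;
    foreach (x; a)
        s += x * x;
    return s;
}

// The same loop unrolled by hand, two elements per iteration,
// with two independent accumulators.
double sumSquaresUnrolled(const double[] a) pure nothrow @safe
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;
    for (; i + 1 < a.length; i += 2)
    {
        s0 += a[i] * a[i];
        s1 += a[i + 1] * a[i + 1];
    }
    if (i < a.length) // leftover element when the length is odd
        s0 += a[i] * a[i];
    return s0 + s1;
}

void main()
{
    auto data = [1.0, 2.0, 3.0, 4.0, 5.0];
    writeln(sumSquares(data), " ", sumSquaresUnrolled(data));
}
```

Whether a rewrite like this helps depends on what the backend
already does with the original loop, so it has to be measured
case by case.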
Bye,
bearophile