On Sunday, 13 August 2017 at 09:56:44 UTC, Johan Engelen wrote:
On Sunday, 13 August 2017 at 09:15:48 UTC, amfvcg wrote:
Change the parameter for this array size to be taken from
stdin and I assume that these optimizations will go away.
This is paramount for all of the testing, examining, and
comparisons that are discussed in this thread.
Full information is given to the compiler, and you are
basically testing the constant folding power of the compilers
(not unimportant).
I agree that in general this is not the right way to benchmark.
However, I am specifically interested in the pattern matching /
constant folding abilities of the compiler. I would have expected
`sum(iota(1, N + 1))` to be replaced with `(N*(N+1))/2`. LDC already
does this optimization in some cases. I have opened an issue for some
of the rest: https://github.com/ldc-developers/ldc/issues/2271
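To make the expected rewrite concrete, here is a minimal sketch of the two forms side by side. The function names `sumByRange` and `sumClosedForm` are mine, purely for illustration, and I am assuming `N` is small enough that `N*(N+1)` does not overflow a `ulong`:

```d
import std.algorithm.iteration : sum;
import std.range : iota;

/// Sums 1, 2, ..., n by walking the range element by element.
ulong sumByRange(ulong n)
{
    return iota(1UL, n + 1).sum;
}

/// The closed form an optimizer could substitute for the loop above.
/// Assumes n * (n + 1) fits in a ulong.
ulong sumClosedForm(ulong n)
{
    return n * (n + 1) / 2;
}
```

Both functions should return the same value for any `n` in that range; the question is only whether the compiler recognizes the first as the second.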
No runtime calculation is needed for the sum. Your program
could be optimized to the following code:
```
import core.time : MonoTime;
import std.stdio : writeln;

void main()
{
    MonoTime beg = MonoTime.currTime;
    MonoTime end = MonoTime.currTime;
    writeln(end - beg);
    writeln(50000000);
}
```
So actually you should be more surprised that the reported time
is not near zero (just the time between two
`MonoTime.currTime` calls)!
On Posix, `MonoTime.currTime`'s implementation uses
clock_gettime(CLOCK_MONOTONIC, ...), which is quite a bit more
involved than simply using the rdtsc instruction on x86. See:
http://linuxmogeb.blogspot.bg/2013/10/how-does-clockgettime-work.html
On Windows, `MonoTime.currTime` uses QueryPerformanceCounter,
which on Win 7 and later uses the rdtsc instruction, which makes
it quite streamlined. In some testing I did several months ago
QueryPerformanceCounter had really good latency and precision
(though I forgot the exact numbers I got).
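If you want to check the clock overhead on your own machine, something along these lines gives a rough figure. This is only a sketch: `averageCurrTimeCost` is a name I made up, and the result includes loop overhead, so treat it as an upper bound rather than a precise latency:

```d
import core.time : Duration, MonoTime;

/// Rough average cost of one MonoTime.currTime call, measured by
/// calling it `calls` times in a row. Loop overhead is included.
Duration averageCurrTimeCost(long calls)
{
    assert(calls > 0, "need at least one call to average over");
    immutable start = MonoTime.currTime;
    MonoTime last = start;
    foreach (_; 0 .. calls)
        last = MonoTime.currTime; // the call being measured
    return (last - start) / calls;
}
```

Calling e.g. `averageCurrTimeCost(1_000_000)` and printing the result with `writeln` should show whether the timer itself accounts for a noticeable part of a near-zero measurement.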
Instead of `iota(1,1000000)`, you should initialize the array
with random numbers with a randomization seed given by the user
(e.g. commandline argument or stdin). Then, the program will
actually have to do the runtime calculations that I assume you
are expecting it to perform.
Agreed, though I think Phobos's unpredictableSeed does an ok job
w.r.t. seeding, so unless you want to repeat the benchmark on the
exact same dataset, something like this does a good job:
```
T[] generate(T)(size_t size)
{
    import std.algorithm.iteration : map;
    import std.range : array, iota;
    import std.random : uniform;

    return size.iota.map!(_ => uniform!T()).array;
}
```
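If you do want to repeat the benchmark on the exact same dataset, a seeded variant is easy enough. This is a sketch under a few assumptions: `generateSeeded` is a hypothetical name, `T` is restricted to integral types, and the seed would come from wherever you like (e.g. a command-line argument parsed with `std.conv.to!uint`):

```d
import std.random : Random, uniform;
import std.traits : isIntegral;

/// Fills an array of `size` elements with pseudo-random values drawn
/// from a generator seeded with `seed`, so equal seeds give equal data.
T[] generateSeeded(T)(size_t size, uint seed)
    if (isIntegral!T)
{
    auto rng = Random(seed);   // Random is Phobos's default engine
    auto result = new T[](size);
    foreach (ref x; result)
        x = uniform!T(rng);    // uniform over the full range of T
    return result;
}
```

Since the seed is only known at run time, the compiler cannot fold the data away, and two runs with the same seed sum the same numbers.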