On Thursday, 23 June 2016 at 23:34:54 UTC, David Nadlinger wrote:
On Thursday, 23 June 2016 at 22:08:20 UTC, Seb wrote:
[1] https://github.com/wilzbach/perf-d/blob/master/test_pow.d
[2] https://github.com/wilzbach/perf-d/blob/master/test_powi.d

This is a bad way to benchmark. You are essentially testing the compiler's ability to propagate your constants into the benchmarking loop and to hoist the code under test out of it.
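To illustrate the failure mode (a hypothetical sketch of mine, not the actual code from [1]): when the operands are compile-time constants visible in the same compilation unit, the optimiser is free to fold the call to a constant and hoist it out of the loop, so the loop ends up timing almost nothing.

---
// Hypothetical anti-pattern, for illustration only.
import std.math : pow;

void main()
{
    import std.datetime : benchmark;
    import std.stdio : writeln;

    long l = 0;
    // Both operands are constants the optimiser can see, so this body can
    // be reduced to `l += <precomputed constant>` and hoisted out of the
    // benchmark loop entirely.
    void f0() { l += pow(5L, 25L); }

    auto rs = benchmark!f0(100_000_000);
    writeln(rs[0].msecs, " ms  sum=", l);
}
---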

For cross-compiler tests, you should define the candidate functions in a separate compilation unit with an extern(C) interface to inhibit any cross-module optimisations. In this case, your code could e.g. be altered to:

---
// test1.d -- the candidate functions, compiled as a separate unit
import std.math : pow;
extern(C) long useStd(long a, long b) { return pow(a, b); }
extern(C) long useOp(long a, long b) { return a ^^ b; }
---

---
// test.d -- the benchmark driver; it sees only the declarations
extern(C) long useStd(long a, long b);
extern(C) long useOp(long a, long b);

void main(string[] args)
{
    import std.datetime: benchmark, Duration;
    import std.stdio: writeln, writefln;
    import std.conv: to;

    long a = 5;
    long b = 25;
    long l = 0;

    void f0() { l += useStd(a, b); }
    void f1() { l += useOp(a, b); }

    auto rs = benchmark!(f0, f1)(100_000_000);

    foreach (j, r; rs)
    {
        version (GNU)
            writefln("%d %d secs %d ms", j, r.seconds(), r.msecs());
        else
            writeln(j, " ", r.to!Duration);
    }

    // prevent any optimization
    writeln(l);
}
---
(Keeping track of the sum is of course no longer really necessary.)
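As a quick sanity check (my addition, not part of the original workflow), you can confirm that the candidate functions really end up as unmangled, out-of-line symbols that the driver cannot see through:

---
$ gdc -finline-functions -frelease -O3 -c test1.d
$ nm test1.o | grep use
---

Both useOp and useStd should show up as defined text (T) symbols; since test.d only contains their declarations, the calls cannot be inlined or constant-folded away.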

I get the following results:

---
$ gdc -finline-functions -frelease -O3 -c test1.d
$ gdc -finline-functions -frelease -O3 test.d test1.o
$ ./a.out
0 0 secs 620 ms
1 0 secs 647 ms
4939766238266722816
---

---
$ ldc2 -O3 -release -c test1.d
$ ldc2 -O3 -release test.d test1.o
$ ./test
0 418 ms, 895 μs, and 3 hnsecs
1 409 ms, 776 μs, and 1 hnsec
4939766238266722816
---

---
$ dmd -O -release -inline -c test1.d
$ dmd -O -release -inline test.d test1.o
$ ./test
0 637 ms, 19 μs, and 9 hnsecs
1 722 ms, 57 μs, and 8 hnsecs
4939766238266722816
---

 — David

This is my preferred way of benchmarking as well; people often tell me about cleverer approaches, but nothing gives me peace of mind like separate compilation without name mangling!

Not wanting to pick on Seb in particular, but I see quite a lot of poor benchmarking on these forums from different people (myself included, though not these days, I hope).

My biggest bugbear is actually the opposite of what you are pointing out here: people doing careful benchmarking and asm inspection of small code fragments in isolation, when in reality the code is always going to be inlined and optimised in context.
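To make that concrete (a hypothetical example of my own, not from the thread): a helper that compiles to real work in isolation can disappear entirely once it is inlined at a call site with known arguments, so the isolated asm tells you little about real-world cost.

---
// In isolation, `scale` compiles to an actual multiply you can inspect.
long scale(long x) { return x * 42; }

long inContext()
{
    // Inlined here with a constant argument, the whole call typically
    // folds to `return 210;` under optimisation -- nothing of the
    // isolated asm survives at this call site.
    return scale(5);
}
---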
