Re: Performance

Thomas via Digitalmars-d Sat, 31 May 2014 10:46:09 -0700

On Saturday, 31 May 2014 at 05:12:54 UTC, Marco Leise wrote:

Run this with: -O3 -frelease -fno-assert -fno-bounds-check -march=native

This way GCC and LLVM will recognize that you alternately add
p0 and p1 to the sum and partially unroll the loop, thereby
removing the condition. It takes 1.4xxxx nanoseconds per step
on my not so new 2.0 GHz notebook, so I assume your PC will
easily reach parity with your original C++ version.
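
For illustration, the effect of that unrolling can be sketched by hand.
This is only an approximation of the transformation, not the code GCC
or LLVM actually emit; the name plusUnrolled is made up for this
sketch. Two steps are handled per iteration, so no per-step test on
i % 2 remains, and the additions happen in the same order as in the
original loop.

```d
import std.stdio;

// Hand-unrolled sketch of the loop in plus(): two steps per iteration.
// plusUnrolled is a hypothetical name, not part of the original program.
double plusUnrolled(size_t steps)
{
    enum p0 = 0.0045;
    enum p1 = 1.00045452 - p0;

    double sum = 1.346346;
    size_t i = 0;
    for (; i + 1 < steps; i += 2)
    {
        sum += p0; // step with even i
        sum += p1; // step with odd i
    }
    if (i < steps)
        sum += p0; // one leftover even step when steps is odd
    return sum;
}

void main()
{
    writefln("Last: %s", plusUnrolled(1_000_000_000));
}
```

If the optimizer does something along these lines, the remaining cost
per step is essentially one dependent floating-point addition.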




import std.stdio;
import core.time;

alias ℕ = size_t;

void main()
{
        run!plus(1_000_000_000);
}

double plus(ℕ steps)
{
        enum p0 = 0.0045;
        enum p1 = 1.00045452 - p0;

        double sum = 1.346346;
        foreach (i; 0 .. steps)
                sum += i%2 ? p1 : p0;
        return sum;
}

void run(alias func)(ℕ steps)
{
        auto t1 = TickDuration.currSystemTick;
        auto output = func(steps);
        auto t2 = TickDuration.currSystemTick;

        auto nanotime = 1_000_000_000.0 / steps * (t2 - t1).length
                / TickDuration.ticksPerSec;

        writefln("Last: %s", output);
        writefln("Time per op: %s", nanotime);
        writeln();
}

Thank you for the help. Which OS is running on your notebook? For I
compiled your source code with your settings with the GCC compiler.
The run took 3.1xxxx nanoseconds per step. For the DMD compiler the
run took 5.xxxx nanoseconds. So I think the problem could be specific
to the Linux versions of the GCC and the DMD compilers.



Thomas
