With the test code given below, which uses std.parallelism, I get some interesting results.

When compiled with

dmd -release -noboundscheck -O -inline -w -wi ~/Work/justd/t_parallelism.d -oft_parallelism

my scalability measurements are as follows:

3.14159 took 221[ms]
3.14159 took 727[ms]
Speedup 3.28959
-5.80829e+09 took 33[ms]
-5.80829e+09 took 201[ms]
Speedup 6.09091

Why do I get a larger speedup for the simpler map function?
Shouldn't it be the opposite?
I've always read that the more calculations you perform per memory access, the better the speedup...
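
If the amount of arithmetic per element is what matters, I guess I could turn it into a tunable knob and watch how the speedup changes. A rough, untested sketch of what I mean (heavyTerm and the repeat count k are just made up for illustration, not part of the benchmark above):

import std.algorithm, std.parallelism, std.range, std.stdio;

enum k = 8; // made-up knob: how many divides to do per element

double heavyTerm(int i) @safe pure nothrow {
    immutable delta = 1.0 / 100_000_000;
    immutable x = (i - 0.5) * delta;
    double acc = 0.0;
    foreach (j; 0 .. k)                       // repeat the divide k times
        acc += delta / (1.0 + x*x + 1e-12*j); // j makes each iteration slightly different
    return acc / k;
}

void main() {
    auto nums = 100_000_000.iota.map!heavyTerm;
    writeln("parallel: ", taskPool.reduce!"a+b"(nums));
    writeln("serial:   ", std.algorithm.reduce!"a+b"(nums));
}

The idea would be to sweep k and see whether the parallel/serial ratio really grows with the work done per element.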

Anyhow, the speedups are great!

I'm sitting on an Intel quad-core with 8 hyperthreads (4 physical cores, 8 hardware threads).
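
To see how much of that comes from hyperthreading, I suppose I could also pin the pool to the 4 physical cores and re-run. A small untested sketch using totalCPUs and defaultPoolThreads from std.parallelism (the worker count 3 is just my guess at "one per physical core, with the main thread helping out"):

import std.parallelism, std.stdio;

void main() {
    // totalCPUs reports the number of hardware threads the OS sees (8 here).
    writeln("hardware threads: ", totalCPUs);

    // Shrink the default pool before it is first used; the default is
    // totalCPUs - 1 because the calling thread also takes part in the work.
    defaultPoolThreads = 3;

    // Or build a dedicated pool with an explicit worker count.
    auto pool = new TaskPool(3);
    scope(exit) pool.finish();
}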


Sample code follows:

import std.algorithm, std.parallelism, std.range, std.datetime, std.stdio;

void test1 () {
    immutable n = 100_000_000;
    immutable delta = 1.0 / n;

    auto piTerm(int i) {
        immutable x = (i - 0.5) * delta;
        return delta / (1.0 + x*x);
    }

    auto nums = n.iota.map!piTerm; // lazy range of quadrature terms
    StopWatch sw;

    sw.reset();
    sw.start();
    immutable pi = 4.0*taskPool.reduce!"a+b"(nums);
    sw.stop();
    immutable ms = sw.peek().msecs;
    writeln(pi, " took ", ms, "[ms]");

    sw.reset();
    sw.start();
    immutable pi_ = 4.0*std.algorithm.reduce!"a+b"(nums);
    sw.stop();
    immutable ms_ = sw.peek().msecs;
    writeln(pi_, " took ", ms_, "[ms]");

    writeln("Speedup ", cast(real)ms_ / ms);
}

auto square(T)(T i) @safe pure nothrow { return i*i; }

void test2 () {
    immutable n = 100_000_000;
    immutable delta = 1.0 / n;

    auto nums = n.iota.map!square; // lazy range of squares (int arithmetic)
    StopWatch sw;

    sw.reset();
    sw.start();
    immutable pi = 4.0*taskPool.reduce!"a+b"(nums);
    sw.stop();
    immutable ms = sw.peek().msecs;
    writeln(pi, " took ", ms, "[ms]");

    sw.reset();
    sw.start();
    immutable pi_ = 4.0*std.algorithm.reduce!"a+b"(nums);
    sw.stop();
    immutable ms_ = sw.peek().msecs;
    writeln(pi_, " took ", ms_, "[ms]");

    writeln("Speedup ", cast(real)ms_ / ms);
}

void main () {
    test1();
    test2();
}
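
One more thing I noticed while pasting this: the negative result in test2 (-5.80829e+09) is just int overflow. iota yields ints, square keeps them ints, and the running sum wraps around (the true sum of squares up to 1e8 is about 3.3e23, which doesn't even fit in a long). An untested sketch of an overflow-free variant that accumulates in double (squareFP is just a name I made up):

import std.algorithm, std.parallelism, std.range, std.stdio;

// Square in floating point so neither the per-element product
// nor the accumulated sum wraps around.
double squareFP(int i) @safe pure nothrow { return cast(double)i * i; }

void main() {
    immutable n = 100_000_000;
    auto nums = n.iota.map!squareFP;
    // Seed both reductions with 0.0 so the accumulator is a double too.
    writeln("parallel: ", taskPool.reduce!"a+b"(0.0, nums));
    writeln("serial:   ", std.algorithm.reduce!"a+b"(0.0, nums));
}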
