Re: Casting MapResult

jmh530 via Digitalmars-d-learn Mon, 22 Jun 2015 18:31:15 -0700

On Tuesday, 16 June 2015 at 16:37:35 UTC, John Colvin wrote:

If you want really fast exponentiation of an array though, you want to use SIMD. Something like http://www.yeppp.info would be easy to use from D.

I've been looking into SIMD a little. It turns out that core.simd only works for DMD on Linux machines. Not sure about the other compilers, but I was stuck on it for a while. I read a little on SIMD, as I had no real understanding of it before you mentioned it. At least I understand now why all the types in core.simd are so small. My initial reaction was that there's no way I would want to write code just for float[4], but now I'm like "oh, that's the whole point".
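To make the float[4] point concrete, here is a minimal sketch of what working with a core.simd type looks like. It assumes a compiler and target where core.simd defines float4 (as noted above, that is not the case everywhere), and addFour is a hypothetical helper name used only for illustration. It does not compute exp; it only shows one operation applying to four lanes at once.

```d
import core.simd : float4;
import std.stdio : writeln;

// Hypothetical helper: one vector add covers four floats at once.
float4 addFour(float4 a, float4 b)
{
    return a + b;
}

void main()
{
    float4 a = [1.0f, 2.0f, 3.0f, 4.0f];
    float4 b = [10.0f, 20.0f, 30.0f, 40.0f];
    float4 c = addFour(a, b);

    // .array exposes the lanes as an ordinary float[4].
    writeln(c.array);
}
```

A longer array would be processed four elements at a time in a loop, with any leftover elements handled by scalar code.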

Anyway, I might try to put something together on my other machine one of these days, but I was able to make a little bit more progress with D's std.parallelism. The foreach loops work great, even on Windows, with little extra work required.
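For reference, a parallel foreach over an array might look roughly like the sketch below. The helper name parallelExp and the input values are made up for illustration; the only std.parallelism piece used is parallel, which splits the loop iterations across the default task pool.

```d
import std.math : exp;
import std.parallelism : parallel;
import std.stdio : writeln;

// Hypothetical helper: fills y[i] = exp(x[i]) using a parallel foreach.
double[] parallelExp(const double[] x)
{
    auto y = new double[](x.length);

    // Each iteration writes only its own element, so no locking is needed.
    foreach (i, ref elem; parallel(y))
        elem = exp(x[i]);

    return y;
}

void main()
{
    auto x = new double[](100_000);
    foreach (i, ref v; x)
        v = i * 1e-5;   // illustrative inputs in [0, 1)

    auto y = parallelExp(x);
    writeln(y[0]);      // exp(0)
}
```

The output array is allocated once up front, so the loop body itself does no allocation.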

That being said, I'm not seeing any speed-up from parallel map. I put some code below doing some variations on std.algorithm.map and taskPool.map. The more memory allocation there is (through .array), the longer everything takes. Keeping things as ranges seems to be much faster.

The most interesting result to me was that taskPool.map was slower than std.algorithm.map in each case. Maybe that comes down to taskPool.map being semi-eager while std.algorithm.map is lazy. The code below doesn't show it, but it seems like the parallel foreach loop is faster than std.algorithm.map or taskPool.map when doing everything with arrays.
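The lazy-versus-eager difference matters when reading the timings: std.algorithm.map does not call the mapped function until something consumes the range, so a function like f0 in the listing that follows only measures building the range object. A small sketch of that behaviour, where countCalls is a hypothetical helper that counts how often the mapped function runs before and after the range is walked:

```d
import std.algorithm : map;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical helper: returns [calls before consumption, calls after].
size_t[2] countCalls(int n)
{
    size_t calls = 0;

    // Building the lazy range does not run the lambda.
    auto r = iota(n).map!((a) { ++calls; return a * 2; });
    size_t before = calls;

    // Walking the range calls the lambda once per element.
    foreach (v; r) {}

    return [before, calls];
}

void main()
{
    auto counts = countCalls(5);
    writeln(counts[0]); // 0
    writeln(counts[1]); // 5
}
```

So for a like-for-like comparison, each variant would need to consume its result (for example by iterating or summing it); otherwise the lazy versions look faster simply because they do no work.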




import std.algorithm;   // needed for the fully qualified std.algorithm.map calls below
import std.datetime;
import std.parallelism;
import std.conv : to;
import std.math : exp;
import std.stdio : writeln;
import std.array : array;
import std.range : iota;

enum real x_size = 100_000;

void f0()
{
        auto y = std.algorithm.map!(a => exp(a))(iota(x_size));
}

void f1()
{
        auto y = taskPool.map!exp(iota(x_size));
}

void f2()
{
        auto y = std.algorithm.map!(a => exp(a))(iota(x_size)).array;
}

void f3()
{
        auto y = taskPool.map!exp(iota(x_size)).array;
}

void f4()
{
        auto y = std.algorithm.map!(a => exp(a))(iota(x_size).array);
}

void f5()
{
        auto y = taskPool.map!exp(iota(x_size).array);
}

void f6()
{
        auto y = std.algorithm.map!(a => exp(a))(iota(x_size).array).array;
}

void f7()
{
        auto y = taskPool.map!exp(iota(x_size).array).array;
}

void main() {
        auto r = benchmark!(f0, f1, f2, f3, f4, f5, f6, f7)(100);
        auto f0Result = to!Duration(r[0]);
        auto f1Result = to!Duration(r[1]);
        auto f2Result = to!Duration(r[2]);
        auto f3Result = to!Duration(r[3]);
        auto f4Result = to!Duration(r[4]);
        auto f5Result = to!Duration(r[5]);
        auto f6Result = to!Duration(r[6]);
        auto f7Result = to!Duration(r[7]);
        writeln(f0Result);                      //prints ~ 17us on my machine
        writeln(f1Result);                      //prints ~ 4.3ms on my machine
        writeln(f2Result);                      //prints ~ 1.7s on my machine
        writeln(f3Result);                      //prints ~ 3.5s on my machine
        writeln(f4Result);                      //prints ~ 471ms on my machine
        writeln(f5Result);                      //prints ~ 473ms on my machine
        writeln(f6Result);                      //prints ~ 1.9s on my machine
        writeln(f7Result);                      //prints ~ 3.9s on my machine
}
