WTF! Parallel foreach slower than normal foreach on a multicore CPU?

Zardoz Thu, 23 Jun 2011 03:05:24 -0700

I'm trying std.parallelism, and I wrote this code (based on the parallel 
foreach example):
import std.stdio;
import std.parallelism;
import std.math;
import std.c.time;

void main () {
    auto logs = new double[20_000_000];
    const num = 10;

    clock_t clk;
    double norm;
    double par;

    writeln("CPUs : ", totalCPUs);

    clk = clock();
    foreach (t; 0..num) {
        foreach (i, ref elem; logs) {
            elem = log(i + 1.0);
        }
    }
    norm = clock() - clk;

    clk = clock();
    foreach (t; 0..num) {
        foreach (i, ref elem; taskPool.parallel(logs, 100)) {
            elem = log(i + 1.0);
        }
    }
    par = clock() - clk;

    norm = norm / num;
    par = par / num;

    writeln("Normal : ", norm / CLOCKS_PER_SEC,
            " Parallel : ", par / CLOCKS_PER_SEC);
}

I get this result:

CPUs : 2
Normal : 1.325 Parallel : 1.646

The result varies by around ±100 ms every time I run it (I think it depends 
on how busy the CPUs are at that moment).

I tried changing workUnitSize from 1 to 10000000 without any appreciable 
change.
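
A minimal sketch of such a workUnitSize sweep, timed with elapsed (wall-clock) time instead of clock(). It assumes a recent D compiler where StopWatch lives in std.datetime.stopwatch (older releases used other module names); wallSecondsFor is a hypothetical helper name, and the array is smaller than in the test above just to keep the run short:

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.math : log;
import std.parallelism : taskPool;
import std.stdio : writefln;

/// Hypothetical helper: elapsed (wall-clock) seconds for one parallel
/// fill of `data` using the given work unit size.
double wallSecondsFor(double[] data, size_t workUnitSize)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (i, ref elem; taskPool.parallel(data, workUnitSize))
    {
        elem = log(i + 1.0);
    }
    return sw.peek().total!"usecs" / 1e6;
}

void main()
{
    auto logs = new double[2_000_000];
    size_t[] units = [1, 100, 10_000, 1_000_000];

    // Time the same loop with several work unit sizes.
    foreach (unit; units)
    {
        writefln("workUnitSize %8d : %.4f s", unit, wallSecondsFor(logs, unit));
    }
}
```

The timings depend on the machine and its load, so a single run of each size is only a rough indication.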
My computer is an AMD Athlon 64 X2 Dual Core Processor 6000+ running 
Kubuntu 11.04 64-bit with 2 GiB of RAM. I compiled the program with dmd 2.053.
htop shows that while the test program runs the parallel foreach, both cores 
are at ~98% load, and with the normal foreach only one core reaches ~99% load.
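
One thing that may explain these numbers: the C clock() function reports processor time consumed by the whole process, and on Linux that is summed over all threads. With both cores busy, a parallel loop can therefore show more CPU seconds than the serial loop even when it finishes sooner. A minimal sketch that reports both figures for the same loop, assuming a recent D compiler where StopWatch lives in std.datetime.stopwatch and clock in core.stdc.time (older releases used other module names); Timing and timeParallelFill are hypothetical names used only for illustration:

```d
import core.stdc.time : clock, CLOCKS_PER_SEC;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.math : log;
import std.parallelism : taskPool;
import std.stdio : writefln;

/// Hypothetical result type: both timings in seconds.
struct Timing
{
    double wall; // elapsed time as seen by the user
    double cpu;  // processor time charged to the process by clock()
}

/// Hypothetical helper: fills `data` with log(i + 1) in parallel and
/// measures the loop with both clocks.
Timing timeParallelFill(double[] data, size_t workUnitSize)
{
    auto sw = StopWatch(AutoStart.yes);
    auto cpuStart = clock();

    foreach (i, ref elem; taskPool.parallel(data, workUnitSize))
    {
        elem = log(i + 1.0);
    }

    double cpuSecs = cast(double)(clock() - cpuStart) / CLOCKS_PER_SEC;
    double wallSecs = sw.peek().total!"usecs" / 1e6;
    return Timing(wallSecs, cpuSecs);
}

void main()
{
    auto logs = new double[2_000_000];
    auto t = timeParallelFill(logs, 100);
    writefln("wall: %.3f s, cpu: %.3f s", t.wall, t.cpu);
}
```

If the wall figure comes out well below the cpu figure on a dual-core machine, the "Parallel" value printed by the test above would be CPU seconds rather than elapsed seconds.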

WTF! Parallel foreach slower than normal foreach on a multicore CPU?
