Russel,

You're making this way more complicated than it needs to be, and I think you're unintentionally making the parallelism much more fine-grained than it should be.  (This may be my fault for not making the docs better.)  You manually create blocks for the various threads to work on, and then reduce() and the other TaskPool functions partition the work again internally.  I've attached a version that's much simpler and more idiomatic, and it gives near-linear speedups.  I've also put something very similar in the documentation (http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html#reduce) as an example of using std.algorithm.map with TaskPool.reduce(), and given you credit for the idea.
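
To see the partitioning that TaskPool already does on its own, a minimal sketch along these lines may help.  It assumes a current Phobos in which TaskPool exposes size and defaultWorkUnitSize(); the range length is just the n from the attached program, and the printed values depend on the machine.

```d
import std.parallelism : taskPool;
import std.stdio : writefln;

void main()
{
    // Same iteration count as the attached program (illustrative only).
    immutable size_t n = 1_000_000_000;

    // Number of worker threads in the default pool.
    writefln("worker threads         = %d", taskPool.size);

    // Work unit size the pool would pick for a range of n elements
    // when no explicit size is passed to reduce().
    writefln("default work unit size = %d", taskPool.defaultWorkUnitSize(n));
}
```

If those work units are already a sensible size, slicing the range by hand beforehand only adds a second layer of splitting on top of this one.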

--Dave

On 3/6/2011 12:23 PM, Russel Winder wrote:
On Sun, 2011-03-06 at 11:57 -0500, David Simcha wrote:
Please post the full code somewhere.  The second one really should
scale better, and I want to understand in context how you're
parallelizing this.
No problem.  All the codes of all the variations in all the languages
are in a Bazaar branch which can be branched from
http://www.russel.org.uk/Bazaar/Pi_Quadrature or if you just want to
browse, the URL is http://www.russel.org.uk:8080/Pi_Quadrature/

This is a simple one stage scatter/gather, that is basically a large
number of additions partitioned to maximize use of processors.  It should
be embarrassingly parallel.
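
For reference, the sum being partitioned is presumably the midpoint-rule quadrature below.  This is a sketch reconstructed from the getTerm function in the attached program, with delta written as \delta and the usual 1-based index assumed:

```latex
\pi = 4\int_0^1 \frac{dx}{1+x^2}
    \approx 4\sum_{i=1}^{n} \frac{\delta}{1+x_i^2},
\qquad x_i = (i - 0.5)\,\delta,
\qquad \delta = \frac{1}{n}
```

Every term is independent of the others, which is why the sum can be split into slices and the partial sums added at the end.  Note that the attached program feeds getTerm from iota(n), i.e. i = 0 to n-1, so its sample points sit one step to the left of these, which changes the result by about 2\delta.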

I use SCons as a compilation driver so as to not have to remember
lengthy command lines, but it is almost certainly the case that there
are a number of assumptions in the SConstruct file about location or
existence of environment variables.

_______________________________________________ phobos mailing list [email protected] http://lists.puremagic.com/mailman/listinfo/phobos

/*
 *  A D program to calculate Pi using quadrature as a parallel map algorithm.
 *
 *  Copyright © 2010--2011 Russel Winder
 */

//  std.parallelism is currently not in Phobos2, though it is being voted on
//  for inclusion in Phobos2, so ensure the compilation command takes care of
//  all the factors to include the library.

import std.algorithm ;
import std.datetime ;
import std.parallelism ;
import std.range ;
import std.stdio ;
import std.typecons ;

void execute ( immutable int numberOfTasks ) {
  immutable n = 1000000000 ;
  immutable delta = 1.0 / n ;
  StopWatch stopWatch ;
  stopWatch.start ( ) ;
  // Used below as reduce()'s work unit size, i.e. the number of terms summed per task.
  immutable sliceSize = n / numberOfTasks ;

  real getTerm(int i) {
    immutable x = ( i - 0.5 ) * delta;
    return delta / ( 1.0 + x * x ) ;
  }

  immutable pi = 4.0 * taskPool.reduce!"a + b"(
    std.algorithm.map!getTerm(iota(n)), sliceSize
  );


  stopWatch.stop ( ) ;
  immutable elapseTime = stopWatch.peek ( ).hnsecs * 100e-9 ;
  writefln ( "==== D Parallel Map pi = %.18f" , pi ) ;
  writefln ( "==== D Parallel Map iteration count = %d" , n ) ;
  writefln ( "==== D Parallel Map elapse = %f" , elapseTime ) ;
  writefln ( "==== D Parallel Map task count = %d" , numberOfTasks ) ;
}

int main ( immutable string[] args ) {
  execute ( 1 ) ;
  writeln ( ) ;
  execute ( 2 ) ;
  writeln ( ) ;
  execute ( 8 ) ;
  writeln ( ) ;
  execute ( 32 ) ;
  return 0 ;
}
