On Sunday, 22 December 2013 at 21:07:11 UTC, Charles McAnany
wrote:
Friends,
I'm writing a little molecular simulator. Without boring you
with the details, here's the gist of it:
struct Atom{
    double x, vx;
    double interaction(Atom a2){
        return (a2.x-this.x)^^2; //more complicated in reality
    }
}
enum double mass = 1.0, deltaT = 0.01; //placeholder values
void main(){
    Atom[] atoms; //(a bunch of atoms in random positions)
    foreach(timestep; 1..1000){ //L0
        foreach(ref atom; atoms){ //L1 (ref, so the write to vx is kept)
            foreach(partner; atoms){ //L2
                atom.vx += atom.interaction(partner)/mass; //F=ma
            }
        }
        foreach(ref atom; atoms){ //L3 (ref, so the write to x is kept)
            atom.x += atom.vx * deltaT;
        }
    }
}
So here's the conundrum: How do I parallelize this efficiently?
The first loop, L0, is not parallelizable at all, and I think
the best speedup will be in parallelizing L1. But I immediately
run into trouble: all the threads need access to all of atoms,
and every atom's position is changed on every pass through L0.
So to do this purely with message passing would involve copying
the entirety of atoms to every thread every L0 pass. Clearly,
shared state is desirable.
But I don't need to be careful about the shared state at all;
L1 only reads Atom.x, and only writes Atom.vx. L3 only reads
Atom.vx and only writes Atom.x. There's no data dependency at
all inside L1 and L3.
Is there a way to inform the compiler of this without just
aggressively casting things to shared and immutable?
On that note, how do you pass a reference to a thread (via
send) without the compiler yelling at you? Do you
cast(immutable Atom[]) on send and cast(Atom[]) on receive?
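On the shared-state question, one option that avoids the casts is a parallel foreach from std.parallelism, which accepts a plain Atom[] slice. Below is a minimal sketch, assuming the access pattern really is as described (L1 writes only the current atom's vx, L3 writes only its x). The step function, the sample positions and the mass and deltaT values are my own placeholders, and interaction takes its argument by ref here so that a partner's vx is never read while another task is writing it.

```d
import std.parallelism : parallel;
import std.stdio : writeln;

struct Atom
{
    double x = 0, vx = 0;

    // Same toy interaction as in the question, (a2.x - this.x)^^2.
    // Taking a2 by ref means only a2.x is read, never a2.vx.
    double interaction(ref const Atom a2) const
    {
        immutable d = a2.x - x;
        return d * d;
    }
}

// Hypothetical helper: one pass through the body of L0.
void step(Atom[] atoms, double mass, double deltaT)
{
    // L1: each task writes only its own atom's vx and reads only x values.
    foreach (ref atom; parallel(atoms))
    {
        foreach (ref partner; atoms) // L2
            atom.vx += atom.interaction(partner) / mass;
    }
    // L3: starts only after the parallel L1 loop has completed.
    foreach (ref atom; parallel(atoms))
        atom.x += atom.vx * deltaT;
}

void main()
{
    auto atoms = [Atom(0.0), Atom(1.0), Atom(3.0)]; // placeholder positions
    foreach (timestep; 0 .. 100) // L0 stays sequential
        step(atoms, 1.0, 1e-3);
    writeln(atoms);
}
```

Each parallel foreach returns only once all of its iterations are done, so L3 never overlaps with L1 and no explicit locking appears in the sketch. Whether this beats message passing for your atom count is something you would have to measure.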
If you're doing a range-limited interaction, partition the atoms
spatially and have each core handle a fixed 3D volume. Check out
the NT method:
http://www.cs.cmu.edu/afs/cs/academic/class/15869-f11/www/readings/shaw05_ntmethod.pdf
When the core that owns an atom detects that it may be within
interaction range of atom(s) owned by another core, send updates
to that other core.
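To make the partitioning idea concrete, here is a rough sketch of a plain cell list. This is not the NT method itself, just the simplest form of spatial binning. It is 1D because the Atom in the question only has an x coordinate, and cellIndex, buildCells, neighbours and the cutoff parameter are hypothetical names of mine. Each atom is only tested against atoms in its own cell and the two adjacent cells.

```d
import std.math : abs, floor;
import std.stdio : writeln;

// Hypothetical helper: index of the cell of width `cutoff` containing x.
long cellIndex(double x, double cutoff)
{
    return cast(long) floor(x / cutoff);
}

// Hypothetical helper: map each occupied cell to the indices of its atoms.
size_t[][long] buildCells(const double[] xs, double cutoff)
{
    size_t[][long] cells;
    foreach (i, x; xs)
        cells[cellIndex(x, cutoff)] ~= i;
    return cells;
}

// Indices of atoms within `cutoff` of atom i, found by scanning only
// the atom's own cell and its left and right neighbour cells.
size_t[] neighbours(const double[] xs, size_t[][long] cells,
                    size_t i, double cutoff)
{
    size_t[] result;
    immutable c = cellIndex(xs[i], cutoff);
    foreach (cell; c - 1 .. c + 2)
    {
        if (auto list = cell in cells)
            foreach (j; *list)
                if (j != i && abs(xs[j] - xs[i]) <= cutoff)
                    result ~= j;
    }
    return result;
}

void main()
{
    auto xs = [0.1, 0.4, 1.2, 5.0]; // placeholder positions
    auto cells = buildCells(xs, 1.0);
    foreach (i; 0 .. xs.length)
        writeln(i, " interacts with ", neighbours(xs, cells, i, 1.0));
}
```

In a multi-core version each core would own a contiguous block of cells, and only atoms in the cells along a block boundary would need to be sent to the neighbouring core.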