Shaun Jackman wrote:
On Tue, 2009-03-24 at 07:03 -0800, Eugene Loh wrote:
I'm not sure I understand this suggestion, so I'll say it the way I
understand it. Would it be possible for each process to send an
"all done" message to each of its neighbors? Conversely, each
process would poll its neighbors for messages, either processing
graph operations or collecting "all done" messages depending on
whether the message indicates a graph operation or signals "all done".
Ashley Pittman wrote:
Exactly. That way you have a defined number of messages, which can be
calculated locally by each process; hence there is no need to use
MPI_Probe, and you can get rid of the MPI_Barrier call.
Hi Eugene,
By `poll its neighbours', do you mean posting an MPI_Irecv for each
neighbour, and working until an `all done' message (sent using
MPI_Send) has been received from each neighbour?
Yes.
As long as each process posts its MPI_Irecv before starting the
MPI_Send, are we guaranteed that two processes won't deadlock by
MPI_Send-ing to each other?
Yes, I think so.
I avoided this method at first because I didn't understand that a
posted MPI_Irecv persists until a matching message arrives; I figured
it had no effect if no message was ready to receive.
Not sure I understand. Let's say you post an MPI_Irecv and you get
something in a follow-up MPI_Test or MPI_Wait... but it's not the "all
done" signal. Rather, it's a graph operation. So, you perform that
graph operation and then post the next MPI_Irecv. Something like
MPI_Irecv(..., &req);
while (1) {
    MPI_Wait(&req, &status);
    if (all_done) break;
    /* perform the graph operation */
    MPI_Irecv(..., &req);
}
In this implementation, the graph is partitioned arbitrarily; the
vertices are distributed based on a hash function of each vertex's
unique ID, not based on the topology of the graph (which would be
nice, but difficult). So, every process is a neighbour of every other
process.
Okay.
I guess one other piece of advice is this. Start with something that
works; make sure it is easy to reason about its correctness. Doesn't
matter if there is excessive synchronization. Then, start running. If
oversynchronization proves to be a bottleneck, then fix it. But don't
fix it until you have data that indicates that it's a problem. I'm sure
the folks on this list can come up with all sorts of great, minimally
synchronizing algorithms, but maybe you can get by with schemes that are
simpler, more robust, etc.