Jeff Squyres wrote:
FWIW, Open MPI's long-term roadmap does include "blocking"
progress -- meaning that it'll (probably) spin aggressively for a
while and if nothing "interesting" is happening, it'll go into a
blocking mode and let the process block in some kind of OS call.
Although we have some interesting ideas on how to do this, it's not
entirely clear when we'll get this done. There have been a few requests
for this kind of feature before, but not a huge demand. This is
probably because most users running MPI jobs tend to devote the
entire core/CPU/server to the MPI job and don't try to run other
jobs concurrently on the same resources.
FWIW, I've run into the need for this a few times due to HPCC tests on
large nodes or multicore systems (>100 MPI procs). HPCC (among other
things) measures the performance of a single process while all other
np-1 processes spin-wait -- or of a single pingpong pair while the
other np-2 processes wait. I'm not 100% sure what's going on, but my
guess is that the hard spinning of the waiting processes hammers the
memory system or some other shared resource, degrading the performance
of the working processes. This is on nodes that are not oversubscribed.
So, I wonder whether interest in less aggressive waits might grow as
node sizes increase.