Howdy,
here's a creative way to deadlock a program: create and destroy a little
more than 65,500 communicators, one after the other, and communicate once
on each of them:
----------------------------------------
#include <mpi.h>
#include <iostream>

// Report any MPI call that does not return MPI_SUCCESS.
#define CHECK(a) \
  { \
    int err = (a); \
    if (err != MPI_SUCCESS) \
      std::cout << "Error in line " << __LINE__ << std::endl; \
  }

int main (int argc, char *argv[])
{
  int a = 0, b;
  MPI_Init (&argc, &argv);
  for (int i = 0; i < 1000000; ++i)
    {
      if (i % 100 == 0)
        std::cout << "Duplication event " << i << std::endl;

      // Create a communicator, use it once, and release it again.
      MPI_Comm dup;
      CHECK (MPI_Comm_dup (MPI_COMM_WORLD, &dup));
      CHECK (MPI_Allreduce (&a, &b, 1, MPI_INT, MPI_MIN, dup));
      CHECK (MPI_Comm_free (&dup));
    }
  MPI_Finalize ();
}
-------------------------------------------
If you run this on, for example, two processes with OpenMPI 1.2.6 or
1.3.2, the program runs until it has printed 65500 and then just hangs.
On my system it sits somewhere in the operating system's poll() call,
running at full steam.
Since I take care to destroy each communicator again, I would have
expected this to work. I create many communicators basically as a
debugging tool: every object gets its own communicator to work on, which
ensures that different objects don't accidentally communicate with each
other just because they all use MPI_COMM_WORLD. It would be nice if this
mode of using MPI could be made to work.
Best & thanks in advance!
Wolfgang
--
-------------------------------------------------------------------------
Wolfgang Bangerth email: [email protected]
www: http://www.math.tamu.edu/~bangerth/