The following code tries to send a message, but if it takes too long the
message is cancelled:
#define DEADLOCK_ABORT (30.0)

MPI_Isend(message, count, MPI_BYTE, comm_id,
          MPI_MESSAGE_TAG, MPI_COMM_WORLD, &request);
t0 = time(NULL);
cancelled = FALSE;
while (TRUE)
{
    // do some work

    // test if message is delivered or cancelled
    MPI_Test(&request, &flag, &status);
    if (flag) break;

    // test if it takes too long
    t1 = time(NULL);
    wall = difftime(t1, t0);
    if (!cancelled && (wall > DEADLOCK_ABORT))
    {
        MPI_Cancel(&request);
        cancelled = TRUE;
        my_printf("cancelled!\n");
    }
}
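The timeout test inside the loop can be tried on its own, without MPI. This is only a minimal sketch: the helper name timed_out is hypothetical and not part of the code above, and it assumes the usual one-second resolution of time(), which is coarse but adequate for a 30-second limit.

```c
#include <time.h>

/* Hypothetical helper (not in the original code): returns 1 once more
 * than `limit` seconds of wall-clock time have elapsed since `start`,
 * and 0 otherwise. Mirrors the t0/t1/difftime check in the loop. */
static int timed_out(time_t start, double limit)
{
    return difftime(time(NULL), start) > limit;
}
```

In the loop above, the condition would then read (!cancelled && timed_out(t0, DEADLOCK_ABORT)).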
With a message size of about 5000 bytes, if the message cannot be delivered
within DEADLOCK_ABORT seconds, MPI_Cancel is executed, but MPI_Test still
never sets flag to true, so it looks like the message cannot be cancelled
for some reason.
I am using Open MPI 1.4.2 on Fedora 13.
Any ideas?
Thanks,
Gijsbert