Folks,

I sometimes get a failure with the c_accumulate test from the ibm test
suite on one host with 4 MPI tasks.

So far, I have only been able to observe this on linux/sparc with the vader btl.

Here is a snippet of the test:

  MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
                 MPI_COMM_WORLD, &Win);

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* Accumulate to everyone, just for the heck of it */

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);


When the test fails, RecvBuff is (rank + 100) instead of the accumulated
value (100 * nprocs + (nprocs - 1) * nprocs / 2).

I am not familiar with one-sided operations nor with MPI_Win_fence.
That being said, I find it suspicious that RecvBuff is initialized *after*
MPI_Win_create ...

Does MPI_Win_fence imply MPI_Barrier?

If not, I guess RecvBuff should be initialized *before* MPI_Win_create.

Does that make sense?

(And if it does, then this issue is not specific to sparc, and
vader is not the root cause.)

Cheers,

Gilles
