Hi Rolf,
Yes, same issue ...
I attached a patch to the GitHub issue (the problem might be in the test itself).
From the standard (11.5 Synchronization Calls):
"The MPI_WIN_FENCE collective synchronization call supports a simple
synchronization pattern that is often used in parallel computations:
namely a loosely-synchronous model, where global computation phases
alternate with global communication phases."
As far as I understand (disclaimer: I am *not* good at reading standards
...), MPI_Win_fence is not necessarily an MPI_Barrier, so there is a race
condition in the test case. It can be avoided by adding an MPI_Barrier
after initializing RecvBuff.
Could someone (Jeff? George?) please double-check this before I push a
fix into the ompi-tests repo?
Cheers,
Gilles
On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:
Hi Gilles:
Is your failure similar to this ticket?
https://github.com/open-mpi/ompi/issues/393
Rolf
*From:* devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of* Gilles Gouaillardet
*Sent:* Monday, April 20, 2015 9:12 AM
*To:* Open MPI Developers
*Subject:* [OMPI devel] c_accumulate
Folks,
I (sometimes) get a failure with the c_accumulate test from the ibm
test suite on one host with 4 MPI tasks.
So far, I was only able to observe this on linux/sparc with the vader btl.
Here is a snippet of the test:
MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
               MPI_COMM_WORLD, &Win);
SendBuff = rank + 100;
RecvBuff = 0;
/* Accumulate to everyone, just for the heck of it */
MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);
When the test fails, RecvBuff is (rank + 100) instead of the accumulated
value (100 * nprocs + (nprocs - 1) * nprocs / 2).
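For reference, here is a small sketch of where that expected value comes from: every rank contributes (rank + 100) through MPI_SUM, so the result is the sum over all ranks. The function names are made up for this illustration and are not part of the test; this is plain C with no MPI calls.

```c
#include <assert.h>

/* Closed form for the value every rank's RecvBuff should hold after the
 * second fence: each of the nprocs ranks adds (rank + 100).
 * Hypothetical helper, not taken from the c_accumulate test. */
int expected_accumulate(int nprocs)
{
    assert(nprocs > 0);
    return 100 * nprocs + (nprocs - 1) * nprocs / 2;
}

/* Same value summed rank by rank, as a cross-check on the closed form. */
int expected_accumulate_bruteforce(int nprocs)
{
    int sum = 0;
    assert(nprocs > 0);
    for (int rank = 0; rank < nprocs; ++rank)
        sum += rank + 100;
    return sum;
}
```

With 4 tasks this gives 100 + 101 + 102 + 103 = 406, whereas a failing rank 2, for example, would see only 102.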
I am not familiar with one-sided operations or MPI_Win_fence.
That being said, I found it suspicious that RecvBuff is initialized
*after* MPI_Win_create ...
Does MPI_Win_fence imply MPI_Barrier?
If not, I guess RecvBuff should be initialized *before* MPI_Win_create.
Does that make sense?
(And if it does make sense, then this issue is not related to sparc,
and vader is not the root cause.)
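To make the suspected ordering concrete, here is a toy model in plain C (no MPI). It assumes, and this is only an assumption, that accumulates from other ranks can reach the window before the local "RecvBuff = 0" store runs. Under that assumption the remote contributions are overwritten and only the rank's own contribution survives, which matches the observed (rank + 100). The function name and the remote_before_reset flag are invented for this sketch.

```c
#include <assert.h>

/* Toy model, not MPI: returns what RecvBuff on `rank` would hold if the
 * contributions from the other ranks landed either before or after the
 * local store "RecvBuff = 0". The rank's own contribution is always
 * applied after its own store, because it calls MPI_Accumulate later in
 * program order. */
int simulate_recvbuff(int rank, int nprocs, int remote_before_reset)
{
    int recv_buff = 0;   /* window memory; initial content is arbitrary */
    int remote_sum = 0;

    assert(nprocs > 0 && rank >= 0 && rank < nprocs);

    /* Sum of what every other rank accumulates into this window. */
    for (int r = 0; r < nprocs; ++r)
        if (r != rank)
            remote_sum += r + 100;

    if (remote_before_reset) {
        recv_buff += remote_sum;  /* early accumulates from other ranks */
        recv_buff = 0;            /* local initialization wipes them */
    } else {
        recv_buff = 0;            /* local initialization happens first */
        recv_buff += remote_sum;  /* remote accumulates arrive afterwards */
    }

    recv_buff += rank + 100;      /* own contribution */
    return recv_buff;
}
```

In this model, rank 2 of 4 ends up with 102 when the remote accumulates come first and 406 otherwise. Whether the first ordering is actually permitted after MPI_Win_fence(MPI_MODE_NOPRECEDE, ...) is exactly the question above, so the model illustrates the hypothesis rather than confirming it.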
Cheers,
Gilles
_______________________________________________
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post:
http://www.open-mpi.org/community/lists/devel/2015/04/17272.php