[OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet
Folks,

i (sometimes) get some failures with the c_accumulate test from the ibm test
suite on one host with 4 mpi tasks

so far, i was only able to observe this on linux/sparc with the vader btl

here is a snippet of the test :

MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
               MPI_COMM_WORLD, &Win);

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* Accumulate to everyone, just for the heck of it */

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);


when the test fails, RecvBuff is (rank+100) instead of the accumulated
value (100 * nprocs + (nprocs-1)*nprocs/2)

i am not familiar with one-sided operations nor MPI_Win_fence.
that being said, i find it suspicious that RecvBuff is initialized *after*
MPI_Win_create ...

does MPI_Win_fence imply an MPI_Barrier ?

if not, i guess RecvBuff should be initialized *before* MPI_Win_create.
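
for example, something along these lines (just a sketch of what i have in
mind, not tested, using the same variables as the snippet above) :

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* RecvBuff is now initialized *before* it can be exposed via the window */
  MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
                 MPI_COMM_WORLD, &Win);

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);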

does that make sense ?

(and if it does make sense, then this issue is not related to sparc, and
vader is not the root cause)

Cheers,

Gilles


Re: [OMPI devel] c_accumulate

2015-04-20 Thread Rolf vandeVaart
Hi Gilles:
Is your failure similar to this ticket?
https://github.com/open-mpi/ompi/issues/393
Rolf

From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Gilles Gouaillardet
Sent: Monday, April 20, 2015 9:12 AM
To: Open MPI Developers
Subject: [OMPI devel] c_accumulate

Folks,

i (sometimes) get some failure with the c_accumulate test from the ibm test 
suite on one host with 4 mpi tasks

so far, i was only able to observe this on linux/sparc with the vader btl

here is a snippet of the test :


MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
               MPI_COMM_WORLD, &Win);

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* Accumulate to everyone, just for the heck of it */

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

when the test fails, RecvBuff is (rank+100) instead of the accumulated value
(100 * nprocs + (nprocs-1)*nprocs/2)

i am not familiar with one-sided operations nor MPI_Win_fence.
that being said, i find it suspicious that RecvBuff is initialized *after*
MPI_Win_create ...

does MPI_Win_fence imply an MPI_Barrier ?

if not, i guess RecvBuff should be initialized *before* MPI_Win_create.

does that make sense ?

(and if it does make sense, then this issue is not related to sparc, and vader 
is not the root cause)

Cheers,

Gilles



Re: [OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet

Hi Rolf,

yes, same issue ...

i attached a patch to the github issue (the issue might be in the test).

From the standards (11.5 Synchronization Calls) :
"TheMPI_WIN_FENCE collective synchronization call supports a simple 
synchroniza-
tion pattern that is often used in parallel computations: namely a 
loosely-synchronous
model, where global computation phases alternate with global 
communication phases."


as far as i understand (disclaimer, i am *not* good at reading standards ...)
this is not necessarily an MPI_Barrier, so there is a race condition in the
test case that can be avoided by adding an MPI_Barrier after initializing
RecvBuff.
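
in the test, that would look something like this (just a sketch of what i
mean, the actual change is in the patch attached to the github issue) :

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* make sure every rank has initialized RecvBuff before any other rank
     can start accumulating into it */
  MPI_Barrier(MPI_COMM_WORLD);

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);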

could someone (Jeff ? George ?) please double check this before i push a 
fix into ompi-tests repo ?


Cheers,

Gilles

On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:


Hi Gilles:

Is your failure similar to this ticket?

https://github.com/open-mpi/ompi/issues/393

Rolf

*From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles 
Gouaillardet

*Sent:* Monday, April 20, 2015 9:12 AM
*To:* Open MPI Developers
*Subject:* [OMPI devel] c_accumulate

Folks,

i (sometimes) get some failure with the c_accumulate test from the ibm 
test suite on one host with 4 mpi tasks


so far, i was only able to observe this on linux/sparc with the vader btl

here is a snippet of the test :

MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
               MPI_COMM_WORLD, &Win);

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* Accumulate to everyone, just for the heck of it */

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

when the test fails, RecvBuff is (rank+100) instead of the accumulated
value (100 * nprocs + (nprocs-1)*nprocs/2)

i am not familiar with one-sided operations nor MPI_Win_fence.

that being said, i find it suspicious that RecvBuff is initialized *after*
MPI_Win_create ...

does MPI_Win_fence imply an MPI_Barrier ?

if not, i guess RecvBuff should be initialized *before* MPI_Win_create.

does that make sense ?

(and if it does make sense, then this issue is not related to sparc, 
and vader is not the root cause)


Cheers,

Gilles










Re: [OMPI devel] c_accumulate

2015-04-20 Thread Kawashima, Takahiro
Hi Gilles, Nathan,

I read the MPI standard but I think the standard doesn't
require a barrier in the test program.

From the standards (11.5.1 Fence) :

A fence call usually entails a barrier synchronization:
  a process completes a call to MPI_WIN_FENCE only after all
  other processes in the group entered their matching call.
  However, a call to MPI_WIN_FENCE that is known not to end
  any epoch (in particular, a call with assert equal to
  MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.

This sentence is misleading.

In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
in the MPI implementation to end access/exposure epochs.

In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
in the MPI implementation to end access/exposure epochs.
Also, a *global* barrier is not necessary in the MPI
implementation to start access/exposure epochs. But some
synchronizations are still needed to start an exposure epoch.

For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
can access the window on rank 1 before rank 2 or others
call MPI_WIN_FENCE. (But rank 0 must wait for rank 1's MPI_WIN_FENCE.)
I think this is the intent of the sentence in the MPI standard
cited above.
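
A small sketch of that scenario (hypothetical code, only to illustrate the
timing; rank and win are assumed to be set up as usual, with an int exposed
in the window on every rank):

  int val = 42;
  MPI_Win_fence(MPI_MODE_NOPRECEDE, win);   /* starts epochs, ends none */
  if (rank == 0) {
    /* rank 0 may access the window on rank 1 as soon as rank 1 has
       entered its fence; it does not have to wait for rank 2 or others */
    MPI_Put(&val, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
  }
  MPI_Win_fence(MPI_MODE_NOSUCCEED, win);   /* ends the epochs */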

Thanks,
Takahiro Kawashima

> Hi Rolf,
> 
> yes, same issue ...
> 
> i attached a patch to the github issue ( the issue might be in the test).
> 
>  From the standards (11.5 Synchronization Calls) :
> "TheMPI_WIN_FENCE collective synchronization call supports a simple 
> synchroniza-
> tion pattern that is often used in parallel computations: namely a 
> loosely-synchronous
> model, where global computation phases alternate with global 
> communication phases."
> 
> as far as i understand (disclaimer, i am *not* good at reading standards 
> ...) this is not
> necessarily an MPI_Barrier, so there is a race condition in the test 
> case that can be avoided
> by adding an MPI_Barrier after initializing RecvBuff.
> 
> could someone (Jeff ? George ?) please double check this before i push a 
> fix into ompi-tests repo ?
> 
> Cheers,
> 
> Gilles
> 
> On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:
> >
> > Hi Gilles:
> >
> > Is your failure similar to this ticket?
> >
> > https://github.com/open-mpi/ompi/issues/393
> >
> > Rolf
> >
> > *From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles 
> > Gouaillardet
> > *Sent:* Monday, April 20, 2015 9:12 AM
> > *To:* Open MPI Developers
> > *Subject:* [OMPI devel] c_accumulate
> >
> > Folks,
> >
> > i (sometimes) get some failure with the c_accumulate test from the ibm 
> > test suite on one host with 4 mpi tasks
> >
> > so far, i was only able to observe this on linux/sparc with the vader btl
> >
> > here is a snippet of the test :
> >
> > MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
> >                MPI_COMM_WORLD, &Win);
> >
> >   SendBuff = rank + 100;
> >   RecvBuff = 0;
> >
> >   /* Accumulate to everyone, just for the heck of it */
> >
> >   MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
> >   for (i = 0; i < size; ++i)
> >     MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
> >   MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);
> >
> > when the test fails, RecvBuff is (rank+100) instead of the accumulated
> > value (100 * nprocs + (nprocs-1)*nprocs/2)
> >
> > i am not familiar with one-sided operations nor MPI_Win_fence.
> >
> > that being said, i find it suspicious that RecvBuff is initialized *after*
> > MPI_Win_create ...
> >
> > does MPI_Win_fence imply an MPI_Barrier ?
> >
> > if not, i guess RecvBuff should be initialized *before* MPI_Win_create.
> >
> > does that make sense ?
> >
> > (and if it does make sense, then this issue is not related to sparc, 
> > and vader is not the root cause)
> >
> > Cheers,
> >
> > Gilles


Re: [OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet

Kawashima-san,

Nathan reached the same conclusion (see the github issue) and i fixed the
test by manually adding an MPI_Barrier.

Cheers,

Gilles

On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:

Hi Gilles, Nathan,

I read the MPI standard but I think the standard doesn't
require a barrier in the test program.

From the standards (11.5.1 Fence) :

  A fence call usually entails a barrier synchronization:
   a process completes a call to MPI_WIN_FENCE only after all
   other processes in the group entered their matching call.
   However, a call to MPI_WIN_FENCE that is known not to end
   any epoch (in particular, a call with assert equal to
   MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.

This sentence is misleading.

In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
in the MPI implementation to end access/exposure epochs.

In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
in the MPI implementation to end access/exposure epochs.
Also, a *global* barrier is not necessary in the MPI
implementation to start access/exposure epochs. But some
synchronizations are still needed to start an exposure epoch.

For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
can access the window on rank 1 before rank 2 or others
call MPI_WIN_FENCE. (But rank 0 must wait rank 1's MPI_WIN_FENCE.)
I think this is the intent of the sentence in the MPI standard
cited above.

Thanks,
Takahiro Kawashima


Hi Rolf,

yes, same issue ...

i attached a patch to the github issue ( the issue might be in the test).

From the standards (11.5 Synchronization Calls) :
"The MPI_WIN_FENCE collective synchronization call supports a simple
synchronization pattern that is often used in parallel computations: namely
a loosely-synchronous model, where global computation phases alternate with
global communication phases."

as far as i understand (disclaimer, i am *not* good at reading standards
...) this is not
necessarily an MPI_Barrier, so there is a race condition in the test
case that can be avoided
by adding an MPI_Barrier after initializing RecvBuff.

could someone (Jeff ? George ?) please double check this before i push a
fix into ompi-tests repo ?

Cheers,

Gilles

On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:

Hi Gilles:

Is your failure similar to this ticket?

https://github.com/open-mpi/ompi/issues/393

Rolf

*From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles
Gouaillardet
*Sent:* Monday, April 20, 2015 9:12 AM
*To:* Open MPI Developers
*Subject:* [OMPI devel] c_accumulate

Folks,

i (sometimes) get some failure with the c_accumulate test from the ibm
test suite on one host with 4 mpi tasks

so far, i was only able to observe this on linux/sparc with the vader btl

here is a snippet of the test :

MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
               MPI_COMM_WORLD, &Win);

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* Accumulate to everyone, just for the heck of it */

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

when the test fails, RecvBuff is (rank+100) instead of the accumulated
value (100 * nprocs + (nprocs-1)*nprocs/2)

i am not familiar with one-sided operations nor MPI_Win_fence.

that being said, i find it suspicious that RecvBuff is initialized *after*
MPI_Win_create ...

does MPI_Win_fence imply an MPI_Barrier ?

if not, i guess RecvBuff should be initialized *before* MPI_Win_create.

does that make sense ?

(and if it does make sense, then this issue is not related to sparc,
and vader is not the root cause)

Cheers,

Gilles







Re: [OMPI devel] c_accumulate

2015-04-20 Thread Kawashima, Takahiro
Hi Gilles, Nathan,

No, my conclusion is that the MPI program does not need an MPI_Barrier
but MPI implementations need some synchronization.

Thanks,
Takahiro Kawashima,

> Kawashima-san,
> 
> Nathan reached the same conclusion (see the github issue) and i fixed 
> the test
> by manually adding a MPI_Barrier.
> 
> Cheers,
> 
> Gilles
> 
> On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:
> > Hi Gilles, Nathan,
> >
> > I read the MPI standard but I think the standard doesn't
> > require a barrier in the test program.
> >
> > From the standards (11.5.1 Fence) :
> >
> >  A fence call usually entails a barrier synchronization:
> >a process completes a call to MPI_WIN_FENCE only after all
> >other processes in the group entered their matching call.
> >However, a call to MPI_WIN_FENCE that is known not to end
> >any epoch (in particular, a call with assert equal to
> >MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.
> >
> > This sentence is misleading.
> >
> > In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
> > in the MPI implementation to end access/exposure epochs.
> >
> > In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
> > in the MPI implementation to end access/exposure epochs.
> > Also, a *global* barrier is not necessary in the MPI
> > implementation to start access/exposure epochs. But some
> > synchronizations are still needed to start an exposure epoch.
> >
> > For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
> > and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
> > can access the window on rank 1 before rank 2 or others
> > call MPI_WIN_FENCE. (But rank 0 must wait rank 1's MPI_WIN_FENCE.)
> > I think this is the intent of the sentence in the MPI standard
> > cited above.
> >
> > Thanks,
> > Takahiro Kawashima
> >
> >> Hi Rolf,
> >>
> >> yes, same issue ...
> >>
> >> i attached a patch to the github issue ( the issue might be in the test).
> >>
> >>   From the standards (11.5 Synchronization Calls) :
> >> "TheMPI_WIN_FENCE collective synchronization call supports a simple
> >> synchroniza-
> >> tion pattern that is often used in parallel computations: namely a
> >> loosely-synchronous
> >> model, where global computation phases alternate with global
> >> communication phases."
> >>
> >> as far as i understand (disclaimer, i am *not* good at reading standards
> >> ...) this is not
> >> necessarily an MPI_Barrier, so there is a race condition in the test
> >> case that can be avoided
> >> by adding an MPI_Barrier after initializing RecvBuff.
> >>
> >> could someone (Jeff ? George ?) please double check this before i push a
> >> fix into ompi-tests repo ?
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:
> >>> Hi Gilles:
> >>>
> >>> Is your failure similar to this ticket?
> >>>
> >>> https://github.com/open-mpi/ompi/issues/393
> >>>
> >>> Rolf
> >>>
> >>> *From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles
> >>> Gouaillardet
> >>> *Sent:* Monday, April 20, 2015 9:12 AM
> >>> *To:* Open MPI Developers
> >>> *Subject:* [OMPI devel] c_accumulate
> >>>
> >>> Folks,
> >>>
> >>> i (sometimes) get some failure with the c_accumulate test from the ibm
> >>> test suite on one host with 4 mpi tasks
> >>>
> >>> so far, i was only able to observe this on linux/sparc with the vader btl
> >>>
> >>> here is a snippet of the test :
> >>>
> >>> MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
> >>>                MPI_COMM_WORLD, &Win);
> >>>
> >>>   SendBuff = rank + 100;
> >>>   RecvBuff = 0;
> >>>
> >>>   /* Accumulate to everyone, just for the heck of it */
> >>>
> >>>   MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
> >>>   for (i = 0; i < size; ++i)
> >>>     MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
> >>>   MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);
> >>>
> >>> when the test fails, RecvBuff is (rank+100) instead of the accumulated
> >>> value (100 * nprocs + (nprocs-1)*nprocs/2)
> >>>
> >>> i am not familiar with one-sided operations nor MPI_Win_fence.
> >>>
> >>> that being said, i find it suspicious that RecvBuff is initialized *after*
> >>> MPI_Win_create ...
> >>>
> >>> does MPI_Win_fence imply an MPI_Barrier ?
> >>>
> >>> if not, i guess RecvBuff should be initialized *before* MPI_Win_create.
> >>>
> >>> does that make sense ?
> >>>
> >>> (and if it does make sense, then this issue is not related to sparc,
> >>> and vader is not the root cause)


Re: [OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet

Kawashima-san,

i am confused ...

as you wrote :


In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
in the MPI implementation to end access/exposure epochs.



and the test case calls MPI_Win_fence with MPI_MODE_NOPRECEDE.

are you saying the Open MPI implementation of MPI_Win_fence should perform
a barrier in this case (i.e. with MPI_MODE_NOPRECEDE) ?

Cheers,

Gilles

On 4/21/2015 11:08 AM, Kawashima, Takahiro wrote:

Hi Gilles, Nathan,

No, my conclusion is that the MPI program does not need a MPI_Barrier
but MPI implementations need some synchronizations.

Thanks,
Takahiro Kawashima,


Kawashima-san,

Nathan reached the same conclusion (see the github issue) and i fixed
the test
by manually adding a MPI_Barrier.

Cheers,

Gilles

On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:

Hi Gilles, Nathan,

I read the MPI standard but I think the standard doesn't
require a barrier in the test program.

From the standards (11.5.1 Fence) :

  A fence call usually entails a barrier synchronization:
a process completes a call to MPI_WIN_FENCE only after all
other processes in the group entered their matching call.
However, a call to MPI_WIN_FENCE that is known not to end
any epoch (in particular, a call with assert equal to
MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.

This sentence is misleading.

In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
in the MPI implementation to end access/exposure epochs.

In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
in the MPI implementation to end access/exposure epochs.
Also, a *global* barrier is not necessary in the MPI
implementation to start access/exposure epochs. But some
synchronizations are still needed to start an exposure epoch.

For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
can access the window on rank 1 before rank 2 or others
call MPI_WIN_FENCE. (But rank 0 must wait rank 1's MPI_WIN_FENCE.)
I think this is the intent of the sentence in the MPI standard
cited above.

Thanks,
Takahiro Kawashima


Hi Rolf,

yes, same issue ...

i attached a patch to the github issue ( the issue might be in the test).

From the standards (11.5 Synchronization Calls) :
"The MPI_WIN_FENCE collective synchronization call supports a simple
synchronization pattern that is often used in parallel computations: namely
a loosely-synchronous model, where global computation phases alternate with
global communication phases."

as far as i understand (disclaimer, i am *not* good at reading standards
...) this is not
necessarily an MPI_Barrier, so there is a race condition in the test
case that can be avoided
by adding an MPI_Barrier after initializing RecvBuff.

could someone (Jeff ? George ?) please double check this before i push a
fix into ompi-tests repo ?

Cheers,

Gilles

On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:

Hi Gilles:

Is your failure similar to this ticket?

https://github.com/open-mpi/ompi/issues/393

Rolf

*From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles
Gouaillardet
*Sent:* Monday, April 20, 2015 9:12 AM
*To:* Open MPI Developers
*Subject:* [OMPI devel] c_accumulate

Folks,

i (sometimes) get some failure with the c_accumulate test from the ibm
test suite on one host with 4 mpi tasks

so far, i was only able to observe this on linux/sparc with the vader btl

here is a snippet of the test :

MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
               MPI_COMM_WORLD, &Win);

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* Accumulate to everyone, just for the heck of it */

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

when the test fails, RecvBuff is (rank+100) instead of the accumulated
value (100 * nprocs + (nprocs-1)*nprocs/2)

i am not familiar with one-sided operations nor MPI_Win_fence.

that being said, i find it suspicious that RecvBuff is initialized *after*
MPI_Win_create ...

does MPI_Win_fence imply an MPI_Barrier ?

if not, i guess RecvBuff should be initialized *before* MPI_Win_create.

does that make sense ?

(and if it does make sense, then this issue is not related to sparc,
and vader is not the root cause)







Re: [OMPI devel] c_accumulate

2015-04-20 Thread Kawashima, Takahiro
Gilles,

Sorry for confusing you.

My understanding is:

MPI_WIN_FENCE has four roles regarding access/exposure epochs.

  - end access epoch
  - end exposure epoch
  - start access epoch
  - start exposure epoch

In order to end access/exposure epochs, a barrier is not needed
in the MPI implementation for MPI_MODE_NOPRECEDE.
But in order to start access/exposure epochs, synchronization
is still needed in the MPI implementation even for MPI_MODE_NOPRECEDE.

This synchronization (the latter case above) is not necessarily
a barrier: a peer-to-peer synchronization for the origin/target
pair is sufficient, but an easy implementation is to use a barrier.
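
As a sketch, the four roles map onto the usual fence/RMA/fence pattern like
this (hypothetical variables; win is assumed to be created beforehand):

  int src = 1, peer = 0;
  MPI_Win_fence(MPI_MODE_NOPRECEDE, win); /* ends nothing (NOPRECEDE),     */
                                          /* starts access/exposure epochs */
  MPI_Accumulate(&src, 1, MPI_INT, peer, 0, 1, MPI_INT, MPI_SUM, win);
  MPI_Win_fence(MPI_MODE_NOSUCCEED, win); /* ends access/exposure epochs,  */
                                          /* starts nothing (NOSUCCEED)    */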

Thanks,
Takahiro Kawashima,

> Kawashima-san,
> 
> i am confused ...
> 
> as you wrote :
> 
> > In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
> > in the MPI implementation to end access/exposure epochs.
> 
> 
> and the test case calls MPI_Win_fence with MPI_MODE_NOPRECEDE.
> 
> are you saying Open MPI implementation of MPI_Win_fence should perform
> a barrier in this case (e.g. MPI_MODE_NOPRECEDE) ?
> 
> Cheers,
> 
> Gilles
> 
> On 4/21/2015 11:08 AM, Kawashima, Takahiro wrote:
> > Hi Gilles, Nathan,
> >
> > No, my conclusion is that the MPI program does not need a MPI_Barrier
> > but MPI implementations need some synchronizations.
> >
> > Thanks,
> > Takahiro Kawashima,
> >
> >> Kawashima-san,
> >>
> >> Nathan reached the same conclusion (see the github issue) and i fixed
> >> the test
> >> by manually adding a MPI_Barrier.
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:
> >>> Hi Gilles, Nathan,
> >>>
> >>> I read the MPI standard but I think the standard doesn't
> >>> require a barrier in the test program.
> >>>
> >>> From the standards (11.5.1 Fence) :
> >>>
> >>>   A fence call usually entails a barrier synchronization:
> >>> a process completes a call to MPI_WIN_FENCE only after all
> >>> other processes in the group entered their matching call.
> >>> However, a call to MPI_WIN_FENCE that is known not to end
> >>> any epoch (in particular, a call with assert equal to
> >>> MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.
> >>>
> >>> This sentence is misleading.
> >>>
> >>> In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
> >>> in the MPI implementation to end access/exposure epochs.
> >>>
> >>> In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
> >>> in the MPI implementation to end access/exposure epochs.
> >>> Also, a *global* barrier is not necessary in the MPI
> >>> implementation to start access/exposure epochs. But some
> >>> synchronizations are still needed to start an exposure epoch.
> >>>
> >>> For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
> >>> and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
> >>> can access the window on rank 1 before rank 2 or others
> >>> call MPI_WIN_FENCE. (But rank 0 must wait rank 1's MPI_WIN_FENCE.)
> >>> I think this is the intent of the sentence in the MPI standard
> >>> cited above.
> >>>
> >>> Thanks,
> >>> Takahiro Kawashima
> >>>
> >>>> Hi Rolf,
> >>>>
> >>>> yes, same issue ...
> >>>>
> >>>> i attached a patch to the github issue ( the issue might be in the test).
> >>>>
> >>>> From the standards (11.5 Synchronization Calls) :
> >>>> "The MPI_WIN_FENCE collective synchronization call supports a simple
> >>>> synchronization pattern that is often used in parallel computations:
> >>>> namely a loosely-synchronous model, where global computation phases
> >>>> alternate with global communication phases."
> >>>>
> >>>> as far as i understand (disclaimer, i am *not* good at reading standards
> >>>> ...) this is not
> >>>> necessarily an MPI_Barrier, so there is a race condition in the test
> >>>> case that can be avoided
> >>>> by adding an MPI_Barrier after initializing RecvBuff.
> >>>>
> >>>> could someone (Jeff ? George ?) please double check this before i push a
> >>>> fi