Re: [OMPI devel] c_accumulate

2015-04-20 Thread Kawashima, Takahiro
Gilles,

Sorry for confusing you.

My understanding is:

MPI_WIN_FENCE has four roles regarding access/exposure epochs.

  - end access epoch
  - end exposure epoch
  - start access epoch
  - start exposure epoch

In order to end access/exposure epochs, a barrier is not needed
in the MPI implementation for MPI_MODE_NOPRECEDE.
But in order to start access/exposure epochs, synchronization
is still needed in the MPI implementation even for MPI_MODE_NOPRECEDE.

This synchronization (the latter case above) is not necessarily
a barrier. A peer-to-peer synchronization between each origin/target
pair is sufficient, but the easiest implementation is to use a barrier.
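
To illustrate the consequence for user code, here is a minimal,
self-contained version of the test pattern (only a sketch; the variable
names follow the ibm test). Per the reasoning above it should be valid
without any explicit MPI_Barrier, because the fence that starts the
epochs must not let remote accumulates reach a target before that
target has entered its own MPI_WIN_FENCE:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size, i;
      int SendBuff, RecvBuff;
      MPI_Win Win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      MPI_Win_create(&RecvBuff, sizeof(int), 1, MPI_INFO_NULL,
                     MPI_COMM_WORLD, &Win);

      SendBuff = rank + 100;
      RecvBuff = 0;   /* initialized after MPI_Win_create, before the fence */

      /* starts the access/exposure epochs; ends nothing (NOPRECEDE) */
      MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);

      for (i = 0; i < size; ++i)
          MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT,
                         MPI_SUM, Win);

      /* ends the epochs; all accumulates are complete after this call */
      MPI_Win_fence(MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED, Win);

      printf("rank %d: RecvBuff = %d (expected %d)\n",
             rank, RecvBuff, 100 * size + (size - 1) * size / 2);

      MPI_Win_free(&Win);
      MPI_Finalize();
      return 0;
  }

If this program fails without a barrier, it points at the MPI_WIN_FENCE
implementation, not at the test.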

Thanks,
Takahiro Kawashima,

> Kawashima-san,
> 
> i am confused ...
> 
> as you wrote :
> 
> > In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
> > in the MPI implementation to end access/exposure epochs.
> 
> 
> and the test case calls MPI_Win_fence with MPI_MODE_NOPRECEDE.
> 
> are you saying the Open MPI implementation of MPI_Win_fence should perform
> a barrier in this case (i.e. MPI_MODE_NOPRECEDE) ?
> 
> Cheers,
> 
> Gilles
> 
> On 4/21/2015 11:08 AM, Kawashima, Takahiro wrote:
> > Hi Gilles, Nathan,
> >
> > No, my conclusion is that the MPI program does not need a MPI_Barrier
> > but MPI implementations need some synchronizations.
> >
> > Thanks,
> > Takahiro Kawashima,
> >
> >> Kawashima-san,
> >>
> >> Nathan reached the same conclusion (see the github issue) and i fixed
> >> the test
> >> by manually adding a MPI_Barrier.
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:
> >>> Hi Gilles, Nathan,
> >>>
> >>> I read the MPI standard but I think the standard doesn't
> >>> require a barrier in the test program.
> >>>
> >>> From the standards (11.5.1 Fence) :
> >>>
> >>>   A fence call usually entails a barrier synchronization:
> >>> a process completes a call to MPI_WIN_FENCE only after all
> >>> other processes in the group entered their matching call.
> >>> However, a call to MPI_WIN_FENCE that is known not to end
> >>> any epoch (in particular, a call with assert equal to
> >>> MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.
> >>>
> >>> This sentence is misleading.
> >>>
> >>> In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
> >>> in the MPI implementation to end access/exposure epochs.
> >>>
> >>> In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
> >>> in the MPI implementation to end access/exposure epochs.
> >>> Also, a *global* barrier is not necessary in the MPI
> >>> implementation to start access/exposure epochs. But some
> >>> synchronizations are still needed to start an exposure epoch.
> >>>
> >>> For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
> >>> and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
> >>> can access the window on rank 1 before rank 2 or others
> >>> call MPI_WIN_FENCE. (But rank 0 must wait for rank 1's MPI_WIN_FENCE.)
> >>> I think this is the intent of the sentence in the MPI standard
> >>> cited above.
> >>>
> >>> Thanks,
> >>> Takahiro Kawashima
> >>>
>  Hi Rolf,
> 
>  yes, same issue ...
> 
>  i attached a patch to the github issue ( the issue might be in the test).
> 
> From the standards (11.5 Synchronization Calls) :
>  "The MPI_WIN_FENCE collective synchronization call supports a simple
>  synchronization pattern that is often used in parallel computations:
>  namely a loosely-synchronous model, where global computation phases
>  alternate with global communication phases."
> 
>  as far as i understand (disclaimer, i am *not* good at reading standards
>  ...) this is not
>  necessarily an MPI_Barrier, so there is a race condition in the test
>  case that can be avoided
>  by adding an MPI_Barrier after initializing RecvBuff.
> 
>  could someone (Jeff ? George ?) please double check this before i push a
>  fix into ompi-tests repo ?
> 
>  Cheers,
> 
>  Gilles
> 
>  On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:
> > Hi Gilles:
> >
> > Is your failure similar to this ticket?
> >
> > https://github.com/open-mpi/ompi/issues/393
> >
> > Rolf
> >
> > *From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles
> > Gouaillardet
> > *Sent:* Monday, April 20, 2015 9:12 AM
> > *To:* Open MPI Developers
> > *Subject:* [OMPI devel] c_accumulate
> >
> > Folks,
> >
> > i (sometimes) get some failure with the c_accumulate test from the ibm
> > test suite on one host with 4 mpi tasks
> >
> > so far, i was only able to observe this on linux/sparc with the vader 
> > btl
> >
> > here is a snippet of the test :
> >
> > MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
> >MPI_COMM_WORLD, &Win);
> > 
> >  SendBuff = rank + 100;
> >  

Re: [OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet

Kawashima-san,

i am confused ...

as you wrote :


In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
in the MPI implementation to end access/exposure epochs.



and the test case calls MPI_Win_fence with MPI_MODE_NOPRECEDE.

are you saying the Open MPI implementation of MPI_Win_fence should perform
a barrier in this case (i.e. MPI_MODE_NOPRECEDE) ?

Cheers,

Gilles

On 4/21/2015 11:08 AM, Kawashima, Takahiro wrote:

Hi Gilles, Nathan,

No, my conclusion is that the MPI program does not need a MPI_Barrier
but MPI implementations need some synchronizations.

Thanks,
Takahiro Kawashima,


Kawashima-san,

Nathan reached the same conclusion (see the github issue) and i fixed
the test
by manually adding a MPI_Barrier.

Cheers,

Gilles

On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:

Hi Gilles, Nathan,

I read the MPI standard but I think the standard doesn't
require a barrier in the test program.

From the standards (11.5.1 Fence) :

  A fence call usually entails a barrier synchronization:
a process completes a call to MPI_WIN_FENCE only after all
other processes in the group entered their matching call.
However, a call to MPI_WIN_FENCE that is known not to end
any epoch (in particular, a call with assert equal to
MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.

This sentence is misleading.

In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
in the MPI implementation to end access/exposure epochs.

In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
in the MPI implementation to end access/exposure epochs.
Also, a *global* barrier is not necessary in the MPI
implementation to start access/exposure epochs. But some
synchronizations are still needed to start an exposure epoch.

For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
can access the window on rank 1 before rank 2 or others
call MPI_WIN_FENCE. (But rank 0 must wait for rank 1's MPI_WIN_FENCE.)
I think this is the intent of the sentence in the MPI standard
cited above.

Thanks,
Takahiro Kawashima


Hi Rolf,

yes, same issue ...

i attached a patch to the github issue ( the issue might be in the test).

   From the standards (11.5 Synchronization Calls) :
"TheMPI_WIN_FENCE collective synchronization call supports a simple
synchroniza-
tion pattern that is often used in parallel computations: namely a
loosely-synchronous
model, where global computation phases alternate with global
communication phases."

as far as i understand (disclaimer, i am *not* good at reading standards
...) this is not
necessarily an MPI_Barrier, so there is a race condition in the test
case that can be avoided
by adding an MPI_Barrier after initializing RecvBuff.

could someone (Jeff ? George ?) please double check this before i push a
fix into ompi-tests repo ?

Cheers,

Gilles

On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:

Hi Gilles:

Is your failure similar to this ticket?

https://github.com/open-mpi/ompi/issues/393

Rolf

*From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles
Gouaillardet
*Sent:* Monday, April 20, 2015 9:12 AM
*To:* Open MPI Developers
*Subject:* [OMPI devel] c_accumulate

Folks,

i (sometimes) get some failure with the c_accumulate test from the ibm
test suite on one host with 4 mpi tasks

so far, i was only able to observe this on linux/sparc with the vader btl

here is a snippet of the test :

MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
   MPI_COMM_WORLD, &Win);

 SendBuff = rank + 100;

 RecvBuff = 0;

 /* Accumulate to everyone, just for the heck of it */

 MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);

 for (i = 0; i < size; ++i)
   MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
 MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

when the test fails, RecvBuff is (rank+100) instead of the accumulated
value (100 * nprocs + (nprocs - 1) * nprocs / 2).

i am not familiar with onesided operations nor MPI_Win_fence.

that being said, i found suspicious RecvBuff is initialized *after*
MPI_Win_create ...

does MPI_Win_fence imply MPI_Barrier ?

if not, i guess RecvBuff should be initialized *before* MPI_Win_create.

makes sense ?

(and if it does make sense, then this issue is not related to sparc,
and vader is not the root cause)

___
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2015/04/17292.php






Re: [OMPI devel] c_accumulate

2015-04-20 Thread Kawashima, Takahiro
Hi Gilles, Nathan,

No, my conclusion is that the MPI program does not need a MPI_Barrier
but MPI implementations need some synchronizations.

Thanks,
Takahiro Kawashima,

> Kawashima-san,
> 
> Nathan reached the same conclusion (see the github issue) and i fixed 
> the test
> by manually adding a MPI_Barrier.
> 
> Cheers,
> 
> Gilles
> 
> On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:
> > Hi Gilles, Nathan,
> >
> > I read the MPI standard but I think the standard doesn't
> > require a barrier in the test program.
> >
> > From the standards (11.5.1 Fence) :
> >
> >  A fence call usually entails a barrier synchronization:
> >a process completes a call to MPI_WIN_FENCE only after all
> >other processes in the group entered their matching call.
> >However, a call to MPI_WIN_FENCE that is known not to end
> >any epoch (in particular, a call with assert equal to
> >MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.
> >
> > This sentence is misleading.
> >
> > In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
> > in the MPI implementation to end access/exposure epochs.
> >
> > In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
> > in the MPI implementation to end access/exposure epochs.
> > Also, a *global* barrier is not necessary in the MPI
> > implementation to start access/exposure epochs. But some
> > synchronizations are still needed to start an exposure epoch.
> >
> > For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
> > and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
> > can access the window on rank 1 before rank 2 or others
> > call MPI_WIN_FENCE. (But rank 0 must wait for rank 1's MPI_WIN_FENCE.)
> > I think this is the intent of the sentence in the MPI standard
> > cited above.
> >
> > Thanks,
> > Takahiro Kawashima
> >
> >> Hi Rolf,
> >>
> >> yes, same issue ...
> >>
> >> i attached a patch to the github issue ( the issue might be in the test).
> >>
> >>   From the standards (11.5 Synchronization Calls) :
> >> "The MPI_WIN_FENCE collective synchronization call supports a simple
> >> synchronization pattern that is often used in parallel computations:
> >> namely a loosely-synchronous model, where global computation phases
> >> alternate with global communication phases."
> >>
> >> as far as i understand (disclaimer, i am *not* good at reading standards
> >> ...) this is not
> >> necessarily an MPI_Barrier, so there is a race condition in the test
> >> case that can be avoided
> >> by adding an MPI_Barrier after initializing RecvBuff.
> >>
> >> could someone (Jeff ? George ?) please double check this before i push a
> >> fix into ompi-tests repo ?
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:
> >>> Hi Gilles:
> >>>
> >>> Is your failure similar to this ticket?
> >>>
> >>> https://github.com/open-mpi/ompi/issues/393
> >>>
> >>> Rolf
> >>>
> >>> *From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles
> >>> Gouaillardet
> >>> *Sent:* Monday, April 20, 2015 9:12 AM
> >>> *To:* Open MPI Developers
> >>> *Subject:* [OMPI devel] c_accumulate
> >>>
> >>> Folks,
> >>>
> >>> i (sometimes) get some failure with the c_accumulate test from the ibm
> >>> test suite on one host with 4 mpi tasks
> >>>
> >>> so far, i was only able to observe this on linux/sparc with the vader btl
> >>>
> >>> here is a snippet of the test :
> >>>
> >>> MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
> >>>   MPI_COMM_WORLD, &Win);
> >>>
> >>> SendBuff = rank + 100;
> >>> RecvBuff = 0;
> >>>
> >>> /* Accumulate to everyone, just for the heck of it */
> >>>
> >>> MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
> >>> for (i = 0; i < size; ++i)
> >>>   MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, 
> >>> Win);
> >>> MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);
> >>>
> >>> when the test fails, RecvBuff is (rank+100) instead of the accumulated
> >>> value (100 * nprocs + (nprocs - 1) * nprocs / 2).
> >>>
> >>> i am not familiar with onesided operations nor MPI_Win_fence.
> >>>
> >>> that being said, i found suspicious RecvBuff is initialized *after*
> >>> MPI_Win_create ...
> >>>
> >>> does MPI_Win_fence imply MPI_Barrier ?
> >>>
> >>> if not, i guess RecvBuff should be initialized *before* MPI_Win_create.
> >>>
> >>> makes sense ?
> >>>
> >>> (and if it does make sense, then this issue is not related to sparc,
> >>> and vader is not the root cause)


Re: [OMPI devel] binding output error

2015-04-20 Thread tmishima
Hi Devendar,

As far as I know, the --report-bindings option shows the logical
cpu numbering. On the other hand, you are talking about the physical one,
I guess.
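
If it helps, the mapping between hwloc's logical core numbers and the
OS (physical) PU numbers on a node can be dumped with a few lines of
hwloc code. A quick sketch (compile with -lhwloc; not tested on your
machine, and error handling is omitted):

  #include <hwloc.h>
  #include <stdio.h>

  int main(void)
  {
      hwloc_topology_t topo;
      int i, ncores;

      hwloc_topology_init(&topo);
      hwloc_topology_load(topo);

      ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
      for (i = 0; i < ncores; i++) {
          hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, i);
          /* first PU inside this core; its os_index is the P# number */
          hwloc_obj_t pu = hwloc_get_obj_inside_cpuset_by_type(
              topo, core->cpuset, HWLOC_OBJ_PU, 0);
          printf("core L#%u -> PU P#%u\n",
                 core->logical_index, pu->os_index);
      }

      hwloc_topology_destroy(topo);
      return 0;
  }

If I read the hwloc-info output in your mail correctly, this would print
core L#14 -> PU P#7 on your node, which is why the rank bound on the
second socket is reported as "core 14".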

Regards,
Tetsuya Mishima

On 2015/04/21 9:04:37, "devel" wrote in "Re: [OMPI devel] binding output
error":
> HT is not enabled.  All node are same topo . This is reproducible even on
single node.
>
>
>
> I ran osu latency to see if it is really is mapped to other socket or not
with –map-by socket.  It looks likes mapping is correct as per latency
test.
>
>
>
> $mpirun -np 2 -report-bindings -map-by
socket  
/hpc/local/benchmarks/hpc-stack-icc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4.1/osu_latency

>
> [clx-orion-001:10084] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././././././././././.][./././././././././././././.]
>
> [clx-orion-001:10084] MCW rank 1 bound to socket 1[core 14[hwt 0]]:
[./././././././././././././.][B/././././././././././././.]
>
> # OSU MPI Latency Test v4.4.1
>
> # Size  Latency (us)
>
> 0   0.50
>
> 1   0.50
>
> 2   0.50
>
> 4   0.49
>
>
>
>
>
> $mpirun -np 2 -report-bindings -cpu-set
1,7 
/hpc/local/benchmarks/hpc-stack-icc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4.1/osu_latency

>
> [clx-orion-001:10155] MCW rank 0 bound to socket 0[core 1[hwt 0]]:
[./B/./././././././././././.][./././././././././././././.]
>
> [clx-orion-001:10155] MCW rank 1 bound to socket 0[core 7[hwt 0]]:
[./././././././B/./././././.][./././././././././././././.]
>
> # OSU MPI Latency Test v4.4.1
>
> # Size  Latency (us)
>
> 0   0.23
>
> 1   0.24
>
> 2   0.23
>
> 4   0.22
>
> 8   0.23
>
>
>
> Both hwloc and /proc/cpuinfo indicates following cpu numbering
>
> socket 0 cpus: 0 1 2 3 4 5 6 14 15 16 17 18 19 20
>
> socket 1 cpus: 7 8 9 10 11 12 13 21 22 23 24 25 26 27
>
>
>
> $hwloc-info -f
>
> Machine (256GB)
>
>   NUMANode L#0 (P#0 128GB) + Socket L#0 + L3 L#0 (35MB)
>
>     L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>
>     L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>
>     L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>
>     L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>
>     L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#4)
>
>     L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
>
>     L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#6)
>
>     L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#14)
>
>     L2 L#8 (256KB) + L1 L#8 (32KB) + Core L#8 + PU L#8 (P#15)
>
>     L2 L#9 (256KB) + L1 L#9 (32KB) + Core L#9 + PU L#9 (P#16)
>
>     L2 L#10 (256KB) + L1 L#10 (32KB) + Core L#10 + PU L#10 (P#17)
>
>     L2 L#11 (256KB) + L1 L#11 (32KB) + Core L#11 + PU L#11 (P#18)
>
>     L2 L#12 (256KB) + L1 L#12 (32KB) + Core L#12 + PU L#12 (P#19)
>
>     L2 L#13 (256KB) + L1 L#13 (32KB) + Core L#13 + PU L#13 (P#20)
>
>   NUMANode L#1 (P#1 128GB) + Socket L#1 + L3 L#1 (35MB)
>
>     L2 L#14 (256KB) + L1 L#14 (32KB) + Core L#14 + PU L#14 (P#7)
>
>     L2 L#15 (256KB) + L1 L#15 (32KB) + Core L#15 + PU L#15 (P#8)
>
>     L2 L#16 (256KB) + L1 L#16 (32KB) + Core L#16 + PU L#16 (P#9)
>
>     L2 L#17 (256KB) + L1 L#17 (32KB) + Core L#17 + PU L#17 (P#10)
>
>     L2 L#18 (256KB) + L1 L#18 (32KB) + Core L#18 + PU L#18 (P#11)
>
>     L2 L#19 (256KB) + L1 L#19 (32KB) + Core L#19 + PU L#19 (P#12)
>
>     L2 L#20 (256KB) + L1 L#20 (32KB) + Core L#20 + PU L#20 (P#13)
>
>     L2 L#21 (256KB) + L1 L#21 (32KB) + Core L#21 + PU L#21 (P#21)
>
>     L2 L#22 (256KB) + L1 L#22 (32KB) + Core L#22 + PU L#22 (P#22)
>
>     L2 L#23 (256KB) + L1 L#23 (32KB) + Core L#23 + PU L#23 (P#23)
>
>     L2 L#24 (256KB) + L1 L#24 (32KB) + Core L#24 + PU L#24 (P#24)
>
>     L2 L#25 (256KB) + L1 L#25 (32KB) + Core L#25 + PU L#25 (P#25)
>
>     L2 L#26 (256KB) + L1 L#26 (32KB) + Core L#26 + PU L#26 (P#26)
>
>     L2 L#27 (256KB) + L1 L#27 (32KB) + Core L#27 + PU L#27 (P#27)
>
>
>
>
>
> So, does --report-bindings show one more level of logical CPU numbering?
>
>
>
>
>
> -Devendar
>
>
>
>
>
> From:devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Monday, April 20, 2015 3:52 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] binding output error
>
>
>
> Also, was this with HT's enabled? I'm wondering if the print code is
incorrectly computing the core because it isn't correctly accounting for HT
cpus.
>
>
>
>
>
> On Mon, Apr 20, 2015 at 3:49 PM, Jeff Squyres (jsquyres)
 wrote:
>
> Ralph's the authority on this one, but just to be sure: are all nodes the
same topology? E.g., does adding "--hetero-nodes" to the mpirun command
line fix the problem?
>
>
>
> > On Apr 20, 2015, at 9:29 AM, Elena Elkina 
wrote:
> >
> > Hi guys,
> >
> > I faced with an issue on our cluster related to mapping & binding
policies on 1.8.5.
> >
> > The matter is that 

Re: [OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet

Kawashima-san,

Nathan reached the same conclusion (see the github issue) and i fixed 
the test

by manually adding a MPI_Barrier.
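
for reference, the change is roughly this (only the relevant part of the
test; the barrier goes right after RecvBuff is initialized) :

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* added: make sure every task has initialized RecvBuff before anyone
     can accumulate into it */
  MPI_Barrier(MPI_COMM_WORLD);

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
      MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);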

Cheers,

Gilles

On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:

Hi Gilles, Nathan,

I read the MPI standard but I think the standard doesn't
require a barrier in the test program.

From the standards (11.5.1 Fence) :

 A fence call usually entails a barrier synchronization:
   a process completes a call to MPI_WIN_FENCE only after all
   other processes in the group entered their matching call.
   However, a call to MPI_WIN_FENCE that is known not to end
   any epoch (in particular, a call with assert equal to
   MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.

This sentence is misleading.

In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
in the MPI implementation to end access/exposure epochs.

In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
in the MPI implementation to end access/exposure epochs.
Also, a *global* barrier is not necessary in the MPI
implementation to start access/exposure epochs. But some
synchronizations are still needed to start an exposure epoch.

For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
can access the window on rank 1 before rank 2 or others
call MPI_WIN_FENCE. (But rank 0 must wait for rank 1's MPI_WIN_FENCE.)
I think this is the intent of the sentence in the MPI standard
cited above.

Thanks,
Takahiro Kawashima


Hi Rolf,

yes, same issue ...

i attached a patch to the github issue ( the issue might be in the test).

  From the standards (11.5 Synchronization Calls) :
"TheMPI_WIN_FENCE collective synchronization call supports a simple
synchroniza-
tion pattern that is often used in parallel computations: namely a
loosely-synchronous
model, where global computation phases alternate with global
communication phases."

as far as i understand (disclaimer, i am *not* good at reading standards
...) this is not
necessarily an MPI_Barrier, so there is a race condition in the test
case that can be avoided
by adding an MPI_Barrier after initializing RecvBuff.

could someone (Jeff ? George ?) please double check this before i push a
fix into ompi-tests repo ?

Cheers,

Gilles

On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:

Hi Gilles:

Is your failure similar to this ticket?

https://github.com/open-mpi/ompi/issues/393

Rolf

*From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles
Gouaillardet
*Sent:* Monday, April 20, 2015 9:12 AM
*To:* Open MPI Developers
*Subject:* [OMPI devel] c_accumulate

Folks,

i (sometimes) get some failure with the c_accumulate test from the ibm
test suite on one host with 4 mpi tasks

so far, i was only able to observe this on linux/sparc with the vader btl

here is a snippet of the test :

MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
  MPI_COMM_WORLD, &Win);
   
SendBuff = rank + 100;

RecvBuff = 0;
   
/* Accumulate to everyone, just for the heck of it */
   
MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);

for (i = 0; i < size; ++i)
  MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

when the test fails, RecvBuff is (rank+100) instead of the accumulated
value (100 * nprocs + (nprocs - 1) * nprocs / 2).

i am not familiar with onesided operations nor MPI_Win_fence.

that being said, i found suspicious RecvBuff is initialized *after*
MPI_Win_create ...

does MPI_Win_fence imply MPI_Barrier ?

if not, i guess RecvBuff should be initialized *before* MPI_Win_create.

makes sense ?

(and if it does make sense, then this issue is not related to sparc,
and vader is not the root cause)

Cheers,

Gilles

___
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2015/04/17289.php






Re: [OMPI devel] c_accumulate

2015-04-20 Thread Kawashima, Takahiro
Hi Gilles, Nathan,

I read the MPI standard but I think the standard doesn't
require a barrier in the test program.

From the standards (11.5.1 Fence) :

A fence call usually entails a barrier synchronization:
  a process completes a call to MPI_WIN_FENCE only after all
  other processes in the group entered their matching call.
  However, a call to MPI_WIN_FENCE that is known not to end
  any epoch (in particular, a call with assert equal to
  MPI_MODE_NOPRECEDE) does not necessarily act as a barrier.

This sentence is misleading.

In the non-MPI_MODE_NOPRECEDE case, a barrier is necessary
in the MPI implementation to end access/exposure epochs.

In the MPI_MODE_NOPRECEDE case, a barrier is not necessary
in the MPI implementation to end access/exposure epochs.
Also, a *global* barrier is not necessary in the MPI
implementation to start access/exposure epochs. But some
synchronizations are still needed to start an exposure epoch.

For example, let's assume all ranks call MPI_WIN_FENCE(MPI_MODE_NOPRECEDE)
and then rank 0 calls MPI_PUT to rank 1. In this case, rank 0
can access the window on rank 1 before rank 2 or others
call MPI_WIN_FENCE. (But rank 0 must wait for rank 1's MPI_WIN_FENCE.)
I think this is the intent of the sentence in the MPI standard
cited above.
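
In code, the situation I mean looks like this (just a fragment to
illustrate the timing; rank, value and win are placeholders):

  /* every rank: open the epochs; this fence ends nothing */
  MPI_Win_fence(MPI_MODE_NOPRECEDE, win);

  if (rank == 0) {
      /* the standard allows this PUT to reach rank 1's window as soon
         as rank 1 has entered the fence above -- rank 0 does not have
         to wait until rank 2 (or anyone else) has called MPI_WIN_FENCE */
      MPI_Put(&value, 1, MPI_INT, 1 /* target */, 0, 1, MPI_INT, win);
  }

  /* every rank: close the epochs; the PUT is complete at rank 1 here */
  MPI_Win_fence(MPI_MODE_NOSUCCEED, win);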

Thanks,
Takahiro Kawashima

> Hi Rolf,
> 
> yes, same issue ...
> 
> i attached a patch to the github issue ( the issue might be in the test).
> 
>  From the standards (11.5 Synchronization Calls) :
> "TheMPI_WIN_FENCE collective synchronization call supports a simple 
> synchroniza-
> tion pattern that is often used in parallel computations: namely a 
> loosely-synchronous
> model, where global computation phases alternate with global 
> communication phases."
> 
> as far as i understand (disclaimer, i am *not* good at reading standards 
> ...) this is not
> necessarily an MPI_Barrier, so there is a race condition in the test 
> case that can be avoided
> by adding an MPI_Barrier after initializing RecvBuff.
> 
> could someone (Jeff ? George ?) please double check this before i push a 
> fix into ompi-tests repo ?
> 
> Cheers,
> 
> Gilles
> 
> On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:
> >
> > Hi Gilles:
> >
> > Is your failure similar to this ticket?
> >
> > https://github.com/open-mpi/ompi/issues/393
> >
> > Rolf
> >
> > *From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles 
> > Gouaillardet
> > *Sent:* Monday, April 20, 2015 9:12 AM
> > *To:* Open MPI Developers
> > *Subject:* [OMPI devel] c_accumulate
> >
> > Folks,
> >
> > i (sometimes) get some failure with the c_accumulate test from the ibm 
> > test suite on one host with 4 mpi tasks
> >
> > so far, i was only able to observe this on linux/sparc with the vader btl
> >
> > here is a snippet of the test :
> >
> > MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
> >  MPI_COMM_WORLD, &Win);
> >   
> >SendBuff = rank + 100;
> >RecvBuff = 0;
> >   
> >/* Accumulate to everyone, just for the heck of it */
> >   
> >MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
> >for (i = 0; i < size; ++i)
> >  MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
> >MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);
> >
> > when the test fails, RecvBuff is (rank+100) instead of the accumulated
> > value (100 * nprocs + (nprocs - 1) * nprocs / 2).
> >
> > i am not familiar with onesided operations nor MPI_Win_fence.
> >
> > that being said, i found suspicious RecvBuff is initialized *after* 
> > MPI_Win_create ...
> >
> > does MPI_Win_fence imply MPI_Barrier ?
> >
> > if not, i guess RecvBuff should be initialized *before* MPI_Win_create.
> >
> > makes sense ?
> >
> > (and if it does make sense, then this issue is not related to sparc, 
> > and vader is not the root cause)
> >
> > Cheers,
> >
> > Gilles


[hwloc-devel] Create success (hwloc git 1.10.1-24-g684fdcd)

2015-04-20 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success.

Snapshot:   hwloc 1.10.1-24-g684fdcd
Start time: Mon Apr 20 21:03:04 EDT 2015
End time:   Mon Apr 20 21:04:29 EDT 2015

Your friendly daemon,
Cyrador


Re: [OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet

Hi Rolf,

yes, same issue ...

i attached a patch to the github issue ( the issue might be in the test).

From the standards (11.5 Synchronization Calls) :
"TheMPI_WIN_FENCE collective synchronization call supports a simple 
synchroniza-
tion pattern that is often used in parallel computations: namely a 
loosely-synchronous
model, where global computation phases alternate with global 
communication phases."


as far as i understand (disclaimer, i am *not* good at reading standards 
...) this is not
necessarily an MPI_Barrier, so there is a race condition in the test 
case that can be avoided

by adding an MPI_Barrier after initializing RecvBuff.

could someone (Jeff ? George ?) please double check this before i push a 
fix into ompi-tests repo ?


Cheers,

Gilles

On 4/20/2015 10:19 PM, Rolf vandeVaart wrote:


Hi Gilles:

Is your failure similar to this ticket?

https://github.com/open-mpi/ompi/issues/393

Rolf

*From:*devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Gilles 
Gouaillardet

*Sent:* Monday, April 20, 2015 9:12 AM
*To:* Open MPI Developers
*Subject:* [OMPI devel] c_accumulate

Folks,

i (sometimes) get some failure with the c_accumulate test from the ibm 
test suite on one host with 4 mpi tasks


so far, i was only able to observe this on linux/sparc with the vader btl

here is a snippet of the test :

MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
 MPI_COMM_WORLD, &Win);
  
   SendBuff = rank + 100;

   RecvBuff = 0;
  
   /* Accumulate to everyone, just for the heck of it */
  
   MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);

   for (i = 0; i < size; ++i)
 MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
   MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

when the test fails, RecvBuff is (rank+100) instead of the accumulated
value (100 * nprocs + (nprocs - 1) * nprocs / 2).


i am not familiar with one-sided operations nor MPI_Win_fence.

that being said, i found it suspicious that RecvBuff is initialized *after*
MPI_Win_create ...


does MPI_Win_fence imply MPI_Barrier ?

if not, i guess RecvBuff should be initialized *before* MPI_Win_create
(see the sketch below).

makes sense ?

(and if it does make sense, then this issue is not related to sparc, 
and vader is not the root cause)
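
for clarity, the reordering i have in mind is simply this (untested
sketch, only the relevant part of the test) :

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* the window is created only after RecvBuff has been initialized */
  MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
                 MPI_COMM_WORLD, &Win);

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
      MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);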


Cheers,

Gilles






___
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2015/04/17272.php




Re: [OMPI devel] binding output error

2015-04-20 Thread Devendar Bureddy
HT is not enabled.  All nodes are the same topo.  This is reproducible even on
a single node.

I ran osu_latency to see if it really is mapped to the other socket or not with
--map-by socket.  It looks like the mapping is correct as per the latency test.

$mpirun -np 2 -report-bindings -map-by socket  
/hpc/local/benchmarks/hpc-stack-icc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4.1/osu_latency
[clx-orion-001:10084] MCW rank 0 bound to socket 0[core 0[hwt 0]]: 
[B/././././././././././././.][./././././././././././././.]
[clx-orion-001:10084] MCW rank 1 bound to socket 1[core 14[hwt 0]]: 
[./././././././././././././.][B/././././././././././././.]
# OSU MPI Latency Test v4.4.1
# Size  Latency (us)
0   0.50
1   0.50
2   0.50
4   0.49


$mpirun -np 2 -report-bindings -cpu-set 1,7 
/hpc/local/benchmarks/hpc-stack-icc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4.1/osu_latency
[clx-orion-001:10155] MCW rank 0 bound to socket 0[core 1[hwt 0]]: 
[./B/./././././././././././.][./././././././././././././.]
[clx-orion-001:10155] MCW rank 1 bound to socket 0[core 7[hwt 0]]: 
[./././././././B/./././././.][./././././././././././././.]
# OSU MPI Latency Test v4.4.1
# Size  Latency (us)
0   0.23
1   0.24
2   0.23
4   0.22
8   0.23

Both hwloc and /proc/cpuinfo indicates following cpu numbering
socket 0 cpus: 0 1 2 3 4 5 6 14 15 16 17 18 19 20
socket 1 cpus: 7 8 9 10 11 12 13 21 22 23 24 25 26 27

$hwloc-info -f
Machine (256GB)
  NUMANode L#0 (P#0 128GB) + Socket L#0 + L3 L#0 (35MB)
L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#1)
L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#3)
L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#4)
L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#6)
L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#14)
L2 L#8 (256KB) + L1 L#8 (32KB) + Core L#8 + PU L#8 (P#15)
L2 L#9 (256KB) + L1 L#9 (32KB) + Core L#9 + PU L#9 (P#16)
L2 L#10 (256KB) + L1 L#10 (32KB) + Core L#10 + PU L#10 (P#17)
L2 L#11 (256KB) + L1 L#11 (32KB) + Core L#11 + PU L#11 (P#18)
L2 L#12 (256KB) + L1 L#12 (32KB) + Core L#12 + PU L#12 (P#19)
L2 L#13 (256KB) + L1 L#13 (32KB) + Core L#13 + PU L#13 (P#20)
  NUMANode L#1 (P#1 128GB) + Socket L#1 + L3 L#1 (35MB)
L2 L#14 (256KB) + L1 L#14 (32KB) + Core L#14 + PU L#14 (P#7)
L2 L#15 (256KB) + L1 L#15 (32KB) + Core L#15 + PU L#15 (P#8)
L2 L#16 (256KB) + L1 L#16 (32KB) + Core L#16 + PU L#16 (P#9)
L2 L#17 (256KB) + L1 L#17 (32KB) + Core L#17 + PU L#17 (P#10)
L2 L#18 (256KB) + L1 L#18 (32KB) + Core L#18 + PU L#18 (P#11)
L2 L#19 (256KB) + L1 L#19 (32KB) + Core L#19 + PU L#19 (P#12)
L2 L#20 (256KB) + L1 L#20 (32KB) + Core L#20 + PU L#20 (P#13)
L2 L#21 (256KB) + L1 L#21 (32KB) + Core L#21 + PU L#21 (P#21)
L2 L#22 (256KB) + L1 L#22 (32KB) + Core L#22 + PU L#22 (P#22)
L2 L#23 (256KB) + L1 L#23 (32KB) + Core L#23 + PU L#23 (P#23)
L2 L#24 (256KB) + L1 L#24 (32KB) + Core L#24 + PU L#24 (P#24)
L2 L#25 (256KB) + L1 L#25 (32KB) + Core L#25 + PU L#25 (P#25)
L2 L#26 (256KB) + L1 L#26 (32KB) + Core L#26 + PU L#26 (P#26)
L2 L#27 (256KB) + L1 L#27 (32KB) + Core L#27 + PU L#27 (P#27)


So, does --report-bindings show one more level of logical CPU numbering?


-Devendar


From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Monday, April 20, 2015 3:52 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] binding output error

Also, was this with HT's enabled? I'm wondering if the print code is 
incorrectly computing the core because it isn't correctly accounting for HT 
cpus.


On Mon, Apr 20, 2015 at 3:49 PM, Jeff Squyres (jsquyres) 
> wrote:
Ralph's the authority on this one, but just to be sure: are all nodes the same 
topology? E.g., does adding "--hetero-nodes" to the mpirun command line fix the 
problem?


> On Apr 20, 2015, at 9:29 AM, Elena Elkina 
> > wrote:
>
> Hi guys,
>
> I faced with an issue on our cluster related to mapping & binding policies on 
> 1.8.5.
>
> The matter is that --report-bindings output doesn't correspond to the locale. 
> It looks like there is a mistake on the output itself, because it just puts 
> serial core number while that core can be on another socket. For example,
>
> mpirun -np 2 --display-devel-map --report-bindings --map-by socket hostname
>  Data for JOB [43064,1] offset 0
>
>  Mapper requested: NULL  Last mapper: round_robin  Mapping policy: BYSOCKET  
> Ranking policy: SOCKET
>  Binding policy: CORE  Cpu set: NULL  PPR: NULL  

Re: [OMPI devel] noticing odd message

2015-04-20 Thread Jeff Squyres (jsquyres)
Thanks!

> On Apr 20, 2015, at 6:52 PM, Nathan Hjelm  wrote:
> 
> 
> Fixed in 359a282e7d31a8a7af3a69ead518ff328862b801. mca_base_var does not
> currently allow component to be registered with NULL for both the
> framework and component.
> 
> -Nathan
> 
> On Mon, Apr 20, 2015 at 04:34:10PM -0600, Howard Pritchard wrote:
>>   Hi Folks,
>>   Working on master, I"m getting an odd message:
>>   malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c,
>>   170)
>>   whenever I launch a job.
>>   It looks like this can be traced back to this line in
>>   orte_ess_singleton_component_register:
>>   mca_base_var_register_synonym(ret, "orte", NULL, NULL, "server", 0);
>>   this just recently started appearing, perhaps today, but I've not been
>>   running
>>   anything over the weekend.
>>   Howard
> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/devel/2015/04/17279.php
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/04/17284.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] noticing odd message

2015-04-20 Thread Ralph Castain
I confirmed it is cleaned up for me - thanks Nathan!


On Mon, Apr 20, 2015 at 3:52 PM, Nathan Hjelm  wrote:

>
> Fixed in 359a282e7d31a8a7af3a69ead518ff328862b801. mca_base_var does not
> currently allow component to be registered with NULL for both the
> framework and component.
>
> -Nathan
>
> On Mon, Apr 20, 2015 at 04:34:10PM -0600, Howard Pritchard wrote:
> >Hi Folks,
> >Working on master, I"m getting an odd message:
> >malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c,
> >170)
> >whenever I launch a job.
> >It looks like this can be traced back to this line in
> >orte_ess_singleton_component_register:
> >    mca_base_var_register_synonym(ret, "orte", NULL, NULL, "server", 0);
> >this just recently started appearing, perhaps today, but I've not been
> >running
> >anything over the weekend.
> >Howard
>
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/04/17279.php
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/04/17284.php
>


Re: [OMPI devel] noticing odd message

2015-04-20 Thread Nathan Hjelm

Fixed in 359a282e7d31a8a7af3a69ead518ff328862b801. mca_base_var does not
currently allow component to be registered with NULL for both the
framework and component.

-Nathan

On Mon, Apr 20, 2015 at 04:34:10PM -0600, Howard Pritchard wrote:
>Hi Folks,
>Working on master, I"m getting an odd message:
>malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c,
>170)
>whenever I launch a job.
>It looks like this can be traced back to this line in
>orte_ess_singleton_component_register:
>    mca_base_var_register_synonym(ret, "orte", NULL, NULL, "server", 0);
>this just recently started appearing, perhaps today, but I've not been
>running
>anything over the weekend.
>Howard

> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/04/17279.php





Re: [OMPI devel] binding output error

2015-04-20 Thread Ralph Castain
Also, was this with HT's enabled? I'm wondering if the print code is
incorrectly computing the core because it isn't correctly accounting for HT
cpus.


On Mon, Apr 20, 2015 at 3:49 PM, Jeff Squyres (jsquyres)  wrote:

> Ralph's the authority on this one, but just to be sure: are all nodes the
> same topology? E.g., does adding "--hetero-nodes" to the mpirun command
> line fix the problem?
>
>
> > On Apr 20, 2015, at 9:29 AM, Elena Elkina 
> wrote:
> >
> > Hi guys,
> >
> > I faced with an issue on our cluster related to mapping & binding
> policies on 1.8.5.
> >
> > The matter is that --report-bindings output doesn't correspond to the
> locale. It looks like there is a mistake on the output itself, because it
> just puts serial core number while that core can be on another socket. For
> example,
> >
> > mpirun -np 2 --display-devel-map --report-bindings --map-by socket
> hostname
> >  Data for JOB [43064,1] offset 0
> >
> >  Mapper requested: NULL  Last mapper: round_robin  Mapping policy:
> BYSOCKET  Ranking policy: SOCKET
> >  Binding policy: CORE  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
> >   Num new daemons: 0  New daemon starting vpid INVALID
> >   Num nodes: 1
> >
> >  Data for node: clx-orion-001 Launch id: -1   State: 2
> >   Daemon: [[43064,0],0]   Daemon launched: True
> >   Num slots: 28   Slots in use: 2 Oversubscribed: FALSE
> >   Num slots allocated: 28 Max slots: 0
> >   Username on node: NULL
> >   Num procs: 2Next node_rank: 2
> >   Data for proc: [[43064,1],0]
> >   Pid: 0  Local rank: 0   Node rank: 0App rank: 0
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 0-6,14-20   Bind location: 0Binding: 0
> >   Data for proc: [[43064,1],1]
> >   Pid: 0  Local rank: 1   Node rank: 1App rank: 1
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 7-13,21-27  Bind location: 7Binding: 7
> > [clx-orion-001:26951] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> [B/././././././././././././.][./././././././././././././.]
> > [clx-orion-001:26951] MCW rank 1 bound to socket 1[core 14[hwt 0]]:
> [./././././././././././././.][B/././././././././././././.]
> >
> > The second process should be bound at core 7 (not core 14).
> >
> >
> > Another example:
> > mpirun -np 8 --display-devel-map --report-bindings --map-by core hostname
> >  Data for JOB [43202,1] offset 0
> >
> >  Mapper requested: NULL  Last mapper: round_robin  Mapping policy:
> BYCORE  Ranking policy: CORE
> >  Binding policy: CORE  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
> >   Num new daemons: 0  New daemon starting vpid INVALID
> >   Num nodes: 1
> >
> >  Data for node: clx-orion-001 Launch id: -1   State: 2
> >   Daemon: [[43202,0],0]   Daemon launched: True
> >   Num slots: 28   Slots in use: 8 Oversubscribed: FALSE
> >   Num slots allocated: 28 Max slots: 0
> >   Username on node: NULL
> >   Num procs: 8Next node_rank: 8
> >   Data for proc: [[43202,1],0]
> >   Pid: 0  Local rank: 0   Node rank: 0App rank: 0
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 0   Bind location: 0Binding: 0
> >   Data for proc: [[43202,1],1]
> >   Pid: 0  Local rank: 1   Node rank: 1App rank: 1
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 1   Bind location: 1Binding: 1
> >   Data for proc: [[43202,1],2]
> >   Pid: 0  Local rank: 2   Node rank: 2App rank: 2
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 2   Bind location: 2Binding: 2
> >   Data for proc: [[43202,1],3]
> >   Pid: 0  Local rank: 3   Node rank: 3App rank: 3
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 3   Bind location: 3Binding: 3
> >   Data for proc: [[43202,1],4]
> >   Pid: 0  Local rank: 4   Node rank: 4App rank: 4
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 4   Bind location: 4Binding: 4
> >   Data for proc: [[43202,1],5]
> >   Pid: 0  Local rank: 5   Node rank: 5App rank: 5
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 5   Bind location: 5Binding: 5
> >   Data for proc: [[43202,1],6]
> >   Pid: 0  Local rank: 6   Node rank: 6App rank: 6
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 6   Bind location: 6Binding: 6
> >   Data for proc: [[43202,1],7]
> >   Pid: 0  Local rank: 7   Node rank: 7App rank: 7
> >   State: INITIALIZED  Restarts: 0 App_context: 0
> Locale: 14  Bind location: 14   Binding: 14
> > 

Re: [OMPI devel] binding output error

2015-04-20 Thread Jeff Squyres (jsquyres)
Ralph's the authority on this one, but just to be sure: are all nodes the same 
topology? E.g., does adding "--hetero-nodes" to the mpirun command line fix the 
problem?


> On Apr 20, 2015, at 9:29 AM, Elena Elkina  wrote:
> 
> Hi guys,
> 
> I faced with an issue on our cluster related to mapping & binding policies on 
> 1.8.5.
> 
> The matter is that --report-bindings output doesn't correspond to the locale. 
> It looks like there is a mistake on the output itself, because it just puts 
> serial core number while that core can be on another socket. For example,
> 
> mpirun -np 2 --display-devel-map --report-bindings --map-by socket hostname
>  Data for JOB [43064,1] offset 0
> 
>  Mapper requested: NULL  Last mapper: round_robin  Mapping policy: BYSOCKET  
> Ranking policy: SOCKET
>  Binding policy: CORE  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
>   Num new daemons: 0  New daemon starting vpid INVALID
>   Num nodes: 1
> 
>  Data for node: clx-orion-001 Launch id: -1   State: 2
>   Daemon: [[43064,0],0]   Daemon launched: True
>   Num slots: 28   Slots in use: 2 Oversubscribed: FALSE
>   Num slots allocated: 28 Max slots: 0
>   Username on node: NULL
>   Num procs: 2Next node_rank: 2
>   Data for proc: [[43064,1],0]
>   Pid: 0  Local rank: 0   Node rank: 0App rank: 0
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 0-6,14-20   Bind location: 0Binding: 0
>   Data for proc: [[43064,1],1]
>   Pid: 0  Local rank: 1   Node rank: 1App rank: 1
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 7-13,21-27  Bind location: 7Binding: 7
> [clx-orion-001:26951] MCW rank 0 bound to socket 0[core 0[hwt 0]]: 
> [B/././././././././././././.][./././././././././././././.]
> [clx-orion-001:26951] MCW rank 1 bound to socket 1[core 14[hwt 0]]: 
> [./././././././././././././.][B/././././././././././././.]
> 
> The second process should be bound at core 7 (not core 14).
> 
> 
> Another example:
> mpirun -np 8 --display-devel-map --report-bindings --map-by core hostname
>  Data for JOB [43202,1] offset 0
> 
>  Mapper requested: NULL  Last mapper: round_robin  Mapping policy: BYCORE  
> Ranking policy: CORE
>  Binding policy: CORE  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
>   Num new daemons: 0  New daemon starting vpid INVALID
>   Num nodes: 1
> 
>  Data for node: clx-orion-001 Launch id: -1   State: 2
>   Daemon: [[43202,0],0]   Daemon launched: True
>   Num slots: 28   Slots in use: 8 Oversubscribed: FALSE
>   Num slots allocated: 28 Max slots: 0
>   Username on node: NULL
>   Num procs: 8Next node_rank: 8
>   Data for proc: [[43202,1],0]
>   Pid: 0  Local rank: 0   Node rank: 0App rank: 0
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 0   Bind location: 0Binding: 0
>   Data for proc: [[43202,1],1]
>   Pid: 0  Local rank: 1   Node rank: 1App rank: 1
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 1   Bind location: 1Binding: 1
>   Data for proc: [[43202,1],2]
>   Pid: 0  Local rank: 2   Node rank: 2App rank: 2
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 2   Bind location: 2Binding: 2
>   Data for proc: [[43202,1],3]
>   Pid: 0  Local rank: 3   Node rank: 3App rank: 3
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 3   Bind location: 3Binding: 3
>   Data for proc: [[43202,1],4]
>   Pid: 0  Local rank: 4   Node rank: 4App rank: 4
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 4   Bind location: 4Binding: 4
>   Data for proc: [[43202,1],5]
>   Pid: 0  Local rank: 5   Node rank: 5App rank: 5
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 5   Bind location: 5Binding: 5
>   Data for proc: [[43202,1],6]
>   Pid: 0  Local rank: 6   Node rank: 6App rank: 6
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 6   Bind location: 6Binding: 6
>   Data for proc: [[43202,1],7]
>   Pid: 0  Local rank: 7   Node rank: 7App rank: 7
>   State: INITIALIZED  Restarts: 0 App_context: 0  Locale: 
> 14  Bind location: 14   Binding: 14
> [clx-orion-001:27069] MCW rank 0 bound to socket 0[core 0[hwt 0]]: 
> [B/././././././././././././.][./././././././././././././.]
> [clx-orion-001:27069] MCW rank 1 bound to socket 0[core 1[hwt 0]]: 
> [./B/./././././././././././.][./././././././././././././.]
> [clx-orion-001:27069] MCW rank 2 bound to socket 0[core 2[hwt 0]]: 
> 

Re: [OMPI devel] noticing odd message

2015-04-20 Thread Nathan Hjelm

Tracking it down now. Probably a typo in a component initialization.

-Nathan

On Mon, Apr 20, 2015 at 04:34:10PM -0600, Howard Pritchard wrote:
>Hi Folks,
>Working on master, I"m getting an odd message:
>malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c,
>170)
>whenever I launch a job.
>It looks like this can be traced back to this line in
>orte_ess_singleton_component_register:
>    mca_base_var_register_synonym(ret, "orte", NULL, NULL, "server", 0);
>this just recently started appearing, perhaps today, but I've not been
>running
>anything over the weekend.
>Howard

> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/04/17279.php





Re: [OMPI devel] noticing odd message

2015-04-20 Thread Jeff Squyres (jsquyres)
+1 -- I saw this today/over the weekend.  I didn't bisect to see where it 
started; I assume it was one of the MCA var base updates.



> On Apr 20, 2015, at 6:34 PM, Howard Pritchard  wrote:
> 
> Hi Folks,
> 
> Working on master, I"m getting an odd message:
> 
> malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c, 170)
> 
> whenever I launch a job.
> 
> It looks like this can be traced back to this line in 
> orte_ess_singleton_component_register:
> 
> mca_base_var_register_synonym(ret, "orte", NULL, NULL, "server", 0);
> 
> this just recently started appearing, perhaps today, but I've not been running
> anything over the weekend.
> 
> Howard
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/04/17279.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



[OMPI devel] noticing odd message

2015-04-20 Thread Howard Pritchard
Hi Folks,

Working on master, I'm getting an odd message:

malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c, 170)

whenever I launch a job.

It looks like this can be traced back to this line in
orte_ess_singleton_component_register:

mca_base_var_register_synonym(ret, "orte", NULL, NULL, "server", 0);

this just recently started appearing, perhaps today, but I've not been
running
anything over the weekend.

Howard


Re: [OMPI devel] Master appears broken on the Mac

2015-04-20 Thread Nathan Hjelm

Shoot. That would be my configure changes. Looks like I should rename
that temporary variable or push/pop it. Will get you a fix soon.

-Nathan

On Mon, Apr 20, 2015 at 01:57:45PM -0700, Ralph Castain wrote:
>Hit this error with current HEAD:
> 
>checking if threads have different pids (pthreads on linux)... configure:
>WARNING: Found configure shell variable clash!
> 
>configure: WARNING: OPAL_VAR_SCOPE_PUSH called on "LDFLAGS_save",
> 
>configure: WARNING: but it is already defined with value "
>-Wl,-flat_namespace"
> 
>configure: WARNING: This usually indicates an error in configure.
> 
>configure: error: Cannot continue
> 
>Any ideas?
>Ralph

> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/04/17277.php





[OMPI devel] Master appears broken on the Mac

2015-04-20 Thread Ralph Castain
Hit this error with current HEAD:

checking if threads have different pids (pthreads on linux)... configure:
WARNING: Found configure shell variable clash!

configure: WARNING: OPAL_VAR_SCOPE_PUSH called on "LDFLAGS_save",

configure: WARNING: but it is already defined with value "
-Wl,-flat_namespace"

configure: WARNING: This usually indicates an error in configure.

configure: error: Cannot continue


Any ideas?
Ralph


Re: [OMPI devel] 1.8.5rc1 is ready for testing

2015-04-20 Thread Jeff Squyres (jsquyres)
Got it; I knew there was a reason -- I just couldn't remember what it was.

If you care, the problem was actually a bug in Libtool's libltdl embedding 
machinery.  We "fixed" the problem by not embedding libltdl by default any more 
(and went a different way...).  If you care:

https://github.com/open-mpi/ompi/issues/311 was the initial identification of 
the issue.

We tried a few different approaches to fix the problem.

https://github.com/open-mpi/ompi/pull/410 was the final solution.

(Ralph just recently PR'ed this over to the v1.8 branch)



> On Apr 20, 2015, at 12:35 PM, Marco Atzeri  wrote:
> 
> On 4/20/2015 5:16 PM, Jeff Squyres (jsquyres) wrote:
>> I looked at this thread in a little more detail...
>> 
>> The question below is a little moot because of the change that was done to 
>> v1.8, but please humor me anyway.  :-)
>> 
> >> Marco: I think you told me before, but I forget, so please refresh my 
>> memory: I seem to recall that there's a reason you're invoking autogen in a 
>> tarball, but I don't remember what it is.
>> 
> 
> Hi Jeff,
> 
> It is a standard best practice for cygwin package build.
> Our package build system (cygport) has autoreconf as default
> before configure..
> 
> 90% of the time is not really needed, but some packages are really
> a pain. So to avoid surprise we play safe.
> 
> 
>> I ask because in all POSIX cases at least, end users should be able to just 
>> untar, configure, make, make install -- they don't/shouldn't run autogen).  
>> I.e., it doesn't matter what version of Libtool end users have installed (or 
>> not!) because we bootstrapped the tarball with a Libtool version that we 
>> know works.  Even more specifically: the error you're running in to should 
>> not have happened with a plain tarball -- the only cases where I can think 
>> of it happening would be if you got a git clone and ran autogen, or if you 
>> got a tarball and (re-)ran autogen.
> 
> It is "got a tarball and (re-)ran autogen"
> 
> I can disable it and test anyway, if it is really needed, but
> I will prefer that autogen.sh works as expected.
> 
> Regards
> Marco
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/04/17275.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.8.5rc1 is ready for testing

2015-04-20 Thread Marco Atzeri

On 4/20/2015 5:16 PM, Jeff Squyres (jsquyres) wrote:

I looked at this thread in a little more detail...

The question below is a little moot because of the change that was done to 
v1.8, but please humor me anyway.  :-)

Marco: I think you told me before, but I forget, so please refresh my memory: I 
seem to recall that there's a reason you're invoking autogen in a tarball, but 
I don't remember what it is.



Hi Jeff,

It is a standard best practice for cygwin package builds.
Our package build system (cygport) runs autoreconf by default
before configure.

90% of the time it is not really needed, but some packages are really
a pain. So to avoid surprises we play safe.



I ask because in all POSIX cases at least, end users should be able to just 
untar, configure, make, make install -- they don't/shouldn't run autogen).  
I.e., it doesn't matter what version of Libtool end users have installed (or 
not!) because we bootstrapped the tarball with a Libtool version that we know 
works.  Even more specifically: the error you're running in to should not have 
happened with a plain tarball -- the only cases where I can think of it 
happening would be if you got a git clone and ran autogen, or if you got a 
tarball and (re-)ran autogen.


It is "got a tarball and (re-)ran autogen"

I can disable it and test anyway, if it is really needed, but
I will prefer that autogen.sh works as expected.

Regards
Marco



Re: [OMPI devel] 1.8.5rc1 is ready for testing

2015-04-20 Thread Jeff Squyres (jsquyres)
I looked at this thread in a little more detail...

The question below is a little moot because of the change that was done to 
v1.8, but please humor me anyway.  :-)

Marco: I think you told me before, but I forget, so please refresh my memory: I 
seem to recall that there's a reason you're invoking autogen in a tarball, but 
I don't remember what it is.  

I ask because in all POSIX cases at least, end users should be able to just 
untar, configure, make, make install -- they don't/shouldn't run autogen).  
I.e., it doesn't matter what version of Libtool end users have installed (or 
not!) because we bootstrapped the tarball with a Libtool version that we know 
works.  Even more specifically: the error you're running in to should not have 
happened with a plain tarball -- the only cases where I can think of it 
happening would be if you got a git clone and ran autogen, or if you got a 
tarball and (re-)ran autogen.



> On Apr 18, 2015, at 3:59 PM, Marco Atzeri  wrote:
> 
> tomorrow is fine .
> I am testing octave-4.0.0-rc3 today
> ;-)
> 
> 
> On 4/18/2015 9:13 PM, Ralph Castain wrote:
>> I am planning on rc2 on Monday, if you’d prefer to wait
>> 
>> 
>>> On Apr 18, 2015, at 9:30 AM, Marco Atzeri  wrote:
>>> 
>>> Are you planning another rc or I should test the git stable repository ?
>>> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/04/17270.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



[OMPI devel] binding output error

2015-04-20 Thread Elena Elkina
Hi guys,

I am facing an issue on our cluster related to mapping & binding policies
on 1.8.5.

The matter is that the --report-bindings output doesn't correspond to the
locale. It looks like there is a mistake in the output itself, because it
just prints a serial core number while that core can be on another socket. For
example,

mpirun -np 2 --display-devel-map --report-bindings --map-by *socket*
hostname
 Data for JOB [43064,1] offset 0

 Mapper requested: NULL  Last mapper: round_robin  Mapping policy: BYSOCKET
 Ranking policy: SOCKET
 Binding policy: CORE  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
  Num new daemons: 0 New daemon starting vpid INVALID
  Num nodes: 1

 Data for node: clx-orion-001  Launch id: -1 State: 2
  Daemon: [[43064,0],0] Daemon launched: True
  Num slots: 28 Slots in use: 2 Oversubscribed: FALSE
  Num slots allocated: 28 Max slots: 0
  Username on node: NULL
  Num procs: 2 Next node_rank: 2
  Data for proc: [[43064,1],0]
  Pid: 0 Local rank: 0 Node rank: 0 App rank: 0
  State: INITIALIZED Restarts: 0 App_context: 0 *Locale: 0-6,14-20* Bind
location: 0 Binding: 0
  Data for proc: [[43064,1],1]
  Pid: 0 Local rank: 1 Node rank: 1 App rank: 1
  State: INITIALIZED Restarts: 0 App_context: 0 *Locale: 7-13,21-27* Bind
location: 7 Binding: 7
[clx-orion-001:26951] MCW rank 0 bound to socket 0[*core 0[*hwt 0]]:
[B/././././././././././././.][./././././././././././././.]
[clx-orion-001:26951] MCW rank 1 bound to socket 1[*core 14*[hwt 0]]:
[./././././././././././././.][B/././././././././././././.]

The second process should be bound to core 7 (not core 14).


Another example:
mpirun -np 8 --display-devel-map --report-bindings --map-by core hostname
 Data for JOB [43202,1] offset 0

 Mapper requested: NULL  Last mapper: round_robin  Mapping policy: BYCORE
 Ranking policy: CORE
 Binding policy: CORE  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
  Num new daemons: 0 New daemon starting vpid INVALID
  Num nodes: 1

 Data for node: clx-orion-001  Launch id: -1 State: 2
  Daemon: [[43202,0],0] Daemon launched: True
  Num slots: 28 Slots in use: 8 Oversubscribed: FALSE
  Num slots allocated: 28 Max slots: 0
  Username on node: NULL
  Num procs: 8 Next node_rank: 8
  Data for proc: [[43202,1],0]
  Pid: 0 Local rank: 0 Node rank: 0 App rank: 0
  State: INITIALIZED Restarts: 0 App_context: 0 Locale: 0 Bind
location: 0 Binding:
0
  Data for proc: [[43202,1],1]
  Pid: 0 Local rank: 1 Node rank: 1 App rank: 1
  State: INITIALIZED Restarts: 0 App_context: 0 Locale: 1 Bind
location: 1 Binding:
1
  Data for proc: [[43202,1],2]
  Pid: 0 Local rank: 2 Node rank: 2 App rank: 2
  State: INITIALIZED Restarts: 0 App_context: 0 Locale: 2 Bind
location: 2 Binding:
2
  Data for proc: [[43202,1],3]
  Pid: 0 Local rank: 3 Node rank: 3 App rank: 3
  State: INITIALIZED Restarts: 0 App_context: 0 Locale: 3 Bind
location: 3 Binding:
3
  Data for proc: [[43202,1],4]
  Pid: 0 Local rank: 4 Node rank: 4 App rank: 4
  State: INITIALIZED Restarts: 0 App_context: 0 Locale: 4 Bind
location: 4 Binding:
4
  Data for proc: [[43202,1],5]
  Pid: 0 Local rank: 5 Node rank: 5 App rank: 5
  State: INITIALIZED Restarts: 0 App_context: 0 Locale: 5 Bind
location: 5 Binding:
5
  Data for proc: [[43202,1],6]
  Pid: 0 Local rank: 6 Node rank: 6 App rank: 6
  State: INITIALIZED Restarts: 0 App_context: 0 Locale: 6 Bind
location: 6 Binding:
6
  Data for proc: [[43202,1],7]
  Pid: 0 Local rank: 7 Node rank: 7 App rank: 7
  State: INITIALIZED Restarts: 0 App_context: 0 *Locale: 14* Bind location:
14 Binding: 14
[clx-orion-001:27069] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 2 bound to socket 0[core 2[hwt 0]]:
[././B/././././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 3 bound to socket 0[core 3[hwt 0]]:
[./././B/./././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 4 bound to socket 0[core 4[hwt 0]]:
[././././B/././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 5 bound to socket 0[core 5[hwt 0]]:
[./././././B/./././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 6 bound to socket 0[core 6[hwt 0]]:
[././././././B/././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 7 bound to socket 0[*core 7*[hwt 0]]:
[./././././././B/./././././.][./././././././././././././.]

Rank 7 should be bound to core 14 instead of core 7, since core 7 is on
another socket.
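
In case it helps narrow it down: the locales above use the OS core numbering
(socket 0 = cores 0-6,14-20, socket 1 = cores 7-13,21-27), while the numbers
printed by --report-bindings look like sequential (logical) core indexes. This
is only a guess from the output above, but it smells like a logical-vs-OS core
index mix-up in the reporting path. Here is a minimal sketch (assuming hwloc is
available; this is not Open MPI source) that prints both numberings for every
core on a node:

  #include <hwloc.h>
  #include <stdio.h>

  int main(void)
  {
      hwloc_topology_t topo;
      int i, ncores;

      hwloc_topology_init(&topo);
      hwloc_topology_load(&topo);

      ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
      for (i = 0; i < ncores; i++) {
          hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, i);
          /* logical_index counts cores in topology order (socket by socket);
           * os_index is the OS numbering, which on this node interleaves
           * sockets (0-6,14-20 vs 7-13,21-27). */
          printf("logical core %u -> OS core %u\n",
                 core->logical_index, core->os_index);
      }

      hwloc_topology_destroy(&topo);
      return 0;
  }

On this node I would expect it to show logical 7 -> OS 14 and logical 14 ->
OS 7, which is exactly the pair of numbers that get swapped in the two reports
above.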

Best regards,
Elena


Re: [OMPI devel] c_accumulate

2015-04-20 Thread Rolf vandeVaart
Hi Gilles:
Is your failure similar to this ticket?
https://github.com/open-mpi/ompi/issues/393
Rolf

From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Gilles Gouaillardet
Sent: Monday, April 20, 2015 9:12 AM
To: Open MPI Developers
Subject: [OMPI devel] c_accumulate

Folks,

I (sometimes) get a failure with the c_accumulate test from the ibm test 
suite on one host with 4 MPI tasks.

So far, I have only been able to observe this on linux/sparc with the vader btl.

here is a snippet of the test:

  MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
                 MPI_COMM_WORLD, &Win);

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* Accumulate to everyone, just for the heck of it */

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

when the test fails, RecvBuff is (rank+100) instead of the accumulated value 
(100 * nprocs + (nprocs - 1) * nprocs / 2).

I am not familiar with one-sided operations or MPI_Win_fence.
That being said, I find it suspicious that RecvBuff is initialized *after* 
MPI_Win_create ...

Does MPI_Win_fence imply MPI_Barrier?

If not, I guess RecvBuff should be initialized *before* MPI_Win_create.

Does that make sense?

(and if it does make sense, then this issue is not related to sparc, and vader 
is not the root cause)

Cheers,

Gilles



[OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet
Folks,

I (sometimes) get a failure with the c_accumulate test from the ibm test
suite on one host with 4 MPI tasks.

So far, I have only been able to observe this on linux/sparc with the vader btl.

here is a snippet of the test:

  MPI_Win_create(&RecvBuff, sizeOfInt, 1, MPI_INFO_NULL,
                 MPI_COMM_WORLD, &Win);

  SendBuff = rank + 100;
  RecvBuff = 0;

  /* Accumulate to everyone, just for the heck of it */

  MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
  for (i = 0; i < size; ++i)
    MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
  MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

when the test fails, RecvBuff is (rank+100) instead of the accumulated
value (100 * nprocs + (nprocs - 1) * nprocs / 2).

I am not familiar with one-sided operations or MPI_Win_fence.
That being said, I find it suspicious that RecvBuff is initialized *after*
MPI_Win_create ...

Does MPI_Win_fence imply MPI_Barrier?

If not, I guess RecvBuff should be initialized *before* MPI_Win_create.

Does that make sense?

(and if it does make sense, then this issue is not related to sparc, and
vader is not the root cause)
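
For reference, here is a minimal, self-contained version of the same pattern
with RecvBuff initialized *before* MPI_Win_create, i.e. the reordering I am
suggesting above. This is just a sketch to illustrate the idea, not the actual
ibm test source:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size, i;
      int SendBuff, RecvBuff;
      MPI_Win Win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      /* initialize the window memory *before* exposing it via the window */
      SendBuff = rank + 100;
      RecvBuff = 0;

      MPI_Win_create(&RecvBuff, sizeof(int), 1, MPI_INFO_NULL,
                     MPI_COMM_WORLD, &Win);

      MPI_Win_fence(MPI_MODE_NOPRECEDE, Win);
      for (i = 0; i < size; ++i)
          MPI_Accumulate(&SendBuff, 1, MPI_INT, i, 0, 1, MPI_INT, MPI_SUM, Win);
      MPI_Win_fence((MPI_MODE_NOPUT | MPI_MODE_NOSUCCEED), Win);

      /* expected on every rank: 100 * size + (size - 1) * size / 2 */
      printf("rank %d: RecvBuff = %d (expected %d)\n",
             rank, RecvBuff, 100 * size + (size - 1) * size / 2);

      MPI_Win_free(&Win);
      MPI_Finalize();
      return 0;
  }

If the failure still shows up with this ordering, then the stale value cannot
be explained by the late initialization, and the problem is more likely on the
MPI_Win_fence side.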

Cheers,

Gilles