Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 doesn't work for our magny cours based 32 core node

2013-12-18 Thread tmishima


Hi, here is the output with "-mca rmaps_base_verbose 10
-mca ess_base_verbose 5". Please see the attached file.

(See attached file: output.txt)

Regards,
Tetsuya Mishima

> Hmm...try adding "-mca rmaps_base_verbose 10 -mca ess_base_verbose 5" to
your cmd line and let's see what it thinks it found.
>
>
> On Dec 18, 2013, at 6:55 PM, tmish...@jcity.maeda.co.jp wrote:
>
> >
> >
> > Hi, I report one more problem with openmpi-1.7.4rc1,
> > which is more serious.
> >
> > For our 32-core nodes (AMD Magny-Cours based), which have
> > 8 NUMA nodes, "-bind-to numa" does not work. Without
> > this option, it works. For your information, I added the
> > lstopo output of the node at the bottom of this mail.
> >
> > Regards,
> > Tetsuya Mishima
> >
> > [mishima@manage ~]$ qsub -I -l nodes=1:ppn=32
> > qsub: waiting for job 8352.manage.cluster to start
> > qsub: job 8352.manage.cluster ready
> >
> > [mishima@node03 demos]$ mpirun -np 8 -report-bindings -bind-to numa
myprog
> > [node03.cluster:15316] [[37582,0],0] bind:upward target NUMANode type
> > Machine
> >
--
> > A request was made to bind to NUMA, but an appropriate target could not
> > be found on node node03.
> >
--
> > [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> > [mishima@node03 demos]$ mpirun -np 8 -report-bindings myprog
> > [node03.cluster:15282] MCW rank 2 bound to socket 1[core 8[hwt 0]]:
> > [./././././././.][B/././././././.][./././././././.][
> > ./././././././.]
> > [node03.cluster:15282] MCW rank 3 bound to socket 1[core 9[hwt 0]]:
> > [./././././././.][./B/./././././.][./././././././.][
> > ./././././././.]
> > [node03.cluster:15282] MCW rank 4 bound to socket 2[core 16[hwt 0]]:
> > [./././././././.][./././././././.][B/././././././.]
> > [./././././././.]
> > [node03.cluster:15282] MCW rank 5 bound to socket 2[core 17[hwt 0]]:
> > [./././././././.][./././././././.][./B/./././././.]
> > [./././././././.]
> > [node03.cluster:15282] MCW rank 6 bound to socket 3[core 24[hwt 0]]:
> > [./././././././.][./././././././.][./././././././.]
> > [B/././././././.]
> > [node03.cluster:15282] MCW rank 7 bound to socket 3[core 25[hwt 0]]:
> > [./././././././.][./././././././.][./././././././.]
> > [./B/./././././.]
> > [node03.cluster:15282] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> > [B/././././././.][./././././././.][./././././././.][
> > ./././././././.]
> > [node03.cluster:15282] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
> > [./B/./././././.][./././././././.][./././././././.][
> > ./././././././.]
> > Hello world from process 2 of 8
> > Hello world from process 5 of 8
> > Hello world from process 4 of 8
> > Hello world from process 3 of 8
> > Hello world from process 1 of 8
> > Hello world from process 7 of 8
> > Hello world from process 6 of 8
> > Hello world from process 0 of 8
> > [mishima@node03 demos]$ ~/opt/hwloc/bin/lstopo-no-graphics
> > Machine (126GB)
> >  Socket L#0 (32GB)
> >    NUMANode L#0 (P#0 16GB) + L3 L#0 (5118KB)
> >      L2 L#0 (512KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0 (P#0)
> >      L2 L#1 (512KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1 (P#1)
> >      L2 L#2 (512KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2 + PU L#2 (P#2)
> >      L2 L#3 (512KB) + L1d L#3 (64KB) + L1i L#3 (64KB) + Core L#3 + PU L#3 (P#3)
> >    NUMANode L#1 (P#1 16GB) + L3 L#1 (5118KB)
> >      L2 L#4 (512KB) + L1d L#4 (64KB) + L1i L#4 (64KB) + Core L#4 + PU L#4 (P#4)
> >      L2 L#5 (512KB) + L1d L#5 (64KB) + L1i L#5 (64KB) + Core L#5 + PU L#5 (P#5)
> >      L2 L#6 (512KB) + L1d L#6 (64KB) + L1i L#6 (64KB) + Core L#6 + PU L#6 (P#6)
> >      L2 L#7 (512KB) + L1d L#7 (64KB) + L1i L#7 (64KB) + Core L#7 + PU L#7 (P#7)
> >  Socket L#1 (32GB)
> >    NUMANode L#2 (P#6 16GB) + L3 L#2 (5118KB)
> >      L2 L#8 (512KB) + L1d L#8 (64KB) + L1i L#8 (64KB) + Core L#8 + PU L#8 (P#8)
> >      L2 L#9 (512KB) + L1d L#9 (64KB) + L1i L#9 (64KB) + Core L#9 + PU L#9 (P#9)
> >      L2 L#10 (512KB) + L1d L#10 (64KB) + L1i L#10 (64KB) + Core L#10 + PU L#10 (P#10)
> >      L2 L#11 (512KB) + L1d L#11 (64KB) + L1i L#11 (64KB) + Core L#11 + PU L#11 (P#11)
> >    NUMANode L#3 (P#7 16GB) + L3 L#3 (5118KB)
> >      L2 L#12 (512KB) + L1d L#12 (64KB) + L1i L#12 (64KB) + Core L#12 + PU L#12 (P#12)
> >      L2 L#13 (512KB) + L1d L#13 (64KB) + L1i L#13 (64KB) + Core L#13 + PU L#13 (P#13)
> >      L2 L#14 (512KB) + L1d L#14 (64KB) + L1i L#14 (64KB) + Core L#14 + PU L#14 (P#14)
> >      L2 L#15 (512KB) + L1d L#15 (64KB) + L1i L#15 (64KB) + Core L#15 + PU L#15 (P#15)
> >  Socket L#2 (32GB)
> >    NUMANode L#4 (P#4 16GB) + L3 L#4 (5118KB)
> >      L2 L#16 (512KB) + L1d L#16 (64KB) + L1i L#16 (64KB) + Core L#16 + PU L#16 (P#16)
> >      L2 L#17 (512KB) + L1d L#17 (64KB) + L1i L#17 (64KB) + Core L#17 + PU L#17 (P#17)
> >      L2 L#18 (512KB) 

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 doesn't work for our magny cours based 32 core node

2013-12-18 Thread Ralph Castain
Hmm...try adding "-mca rmaps_base_verbose 10 -mca ess_base_verbose 5" to your 
cmd line and let's see what it thinks it found.


On Dec 18, 2013, at 6:55 PM, tmish...@jcity.maeda.co.jp wrote:

> 
> 
> Hi, I report one more problem with openmpi-1.7.4rc1,
> which is more serious.
> 
> For our 32-core nodes (AMD Magny-Cours based), which have
> 8 NUMA nodes, "-bind-to numa" does not work. Without
> this option, it works. For your information, I added the
> lstopo output of the node at the bottom of this mail.
> 
> Regards,
> Tetsuya Mishima
> 
> [mishima@manage ~]$ qsub -I -l nodes=1:ppn=32
> qsub: waiting for job 8352.manage.cluster to start
> qsub: job 8352.manage.cluster ready
> 
> [mishima@node03 demos]$ mpirun -np 8 -report-bindings -bind-to numa myprog
> [node03.cluster:15316] [[37582,0],0] bind:upward target NUMANode type
> Machine
> --
> A request was made to bind to NUMA, but an appropriate target could not
> be found on node node03.
> --
> [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> [mishima@node03 demos]$ mpirun -np 8 -report-bindings myprog
> [node03.cluster:15282] MCW rank 2 bound to socket 1[core 8[hwt 0]]:
> [./././././././.][B/././././././.][./././././././.][
> ./././././././.]
> [node03.cluster:15282] MCW rank 3 bound to socket 1[core 9[hwt 0]]:
> [./././././././.][./B/./././././.][./././././././.][
> ./././././././.]
> [node03.cluster:15282] MCW rank 4 bound to socket 2[core 16[hwt 0]]:
> [./././././././.][./././././././.][B/././././././.]
> [./././././././.]
> [node03.cluster:15282] MCW rank 5 bound to socket 2[core 17[hwt 0]]:
> [./././././././.][./././././././.][./B/./././././.]
> [./././././././.]
> [node03.cluster:15282] MCW rank 6 bound to socket 3[core 24[hwt 0]]:
> [./././././././.][./././././././.][./././././././.]
> [B/././././././.]
> [node03.cluster:15282] MCW rank 7 bound to socket 3[core 25[hwt 0]]:
> [./././././././.][./././././././.][./././././././.]
> [./B/./././././.]
> [node03.cluster:15282] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> [B/././././././.][./././././././.][./././././././.][
> ./././././././.]
> [node03.cluster:15282] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
> [./B/./././././.][./././././././.][./././././././.][
> ./././././././.]
> Hello world from process 2 of 8
> Hello world from process 5 of 8
> Hello world from process 4 of 8
> Hello world from process 3 of 8
> Hello world from process 1 of 8
> Hello world from process 7 of 8
> Hello world from process 6 of 8
> Hello world from process 0 of 8
> [mishima@node03 demos]$ ~/opt/hwloc/bin/lstopo-no-graphics
> Machine (126GB)
>  Socket L#0 (32GB)
>NUMANode L#0 (P#0 16GB) + L3 L#0 (5118KB)
>  L2 L#0 (512KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0
> (P#0)
>  L2 L#1 (512KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1
> (P#1)
>  L2 L#2 (512KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2 + PU L#2
> (P#2)
>  L2 L#3 (512KB) + L1d L#3 (64KB) + L1i L#3 (64KB) + Core L#3 + PU L#3
> (P#3)
>NUMANode L#1 (P#1 16GB) + L3 L#1 (5118KB)
>  L2 L#4 (512KB) + L1d L#4 (64KB) + L1i L#4 (64KB) + Core L#4 + PU L#4
> (P#4)
>  L2 L#5 (512KB) + L1d L#5 (64KB) + L1i L#5 (64KB) + Core L#5 + PU L#5
> (P#5)
>  L2 L#6 (512KB) + L1d L#6 (64KB) + L1i L#6 (64KB) + Core L#6 + PU L#6
> (P#6)
>  L2 L#7 (512KB) + L1d L#7 (64KB) + L1i L#7 (64KB) + Core L#7 + PU L#7
> (P#7)
>  Socket L#1 (32GB)
>NUMANode L#2 (P#6 16GB) + L3 L#2 (5118KB)
>  L2 L#8 (512KB) + L1d L#8 (64KB) + L1i L#8 (64KB) + Core L#8 + PU L#8
> (P#8)
>  L2 L#9 (512KB) + L1d L#9 (64KB) + L1i L#9 (64KB) + Core L#9 + PU L#9
> (P#9)
>  L2 L#10 (512KB) + L1d L#10 (64KB) + L1i L#10 (64KB) + Core L#10 + PU
> L#10 (P#10)
>  L2 L#11 (512KB) + L1d L#11 (64KB) + L1i L#11 (64KB) + Core L#11 + PU
> L#11 (P#11)
>NUMANode L#3 (P#7 16GB) + L3 L#3 (5118KB)
>  L2 L#12 (512KB) + L1d L#12 (64KB) + L1i L#12 (64KB) + Core L#12 + PU
> L#12 (P#12)
>  L2 L#13 (512KB) + L1d L#13 (64KB) + L1i L#13 (64KB) + Core L#13 + PU
> L#13 (P#13)
>  L2 L#14 (512KB) + L1d L#14 (64KB) + L1i L#14 (64KB) + Core L#14 + PU
> L#14 (P#14)
>  L2 L#15 (512KB) + L1d L#15 (64KB) + L1i L#15 (64KB) + Core L#15 + PU
> L#15 (P#15)
>  Socket L#2 (32GB)
>NUMANode L#4 (P#4 16GB) + L3 L#4 (5118KB)
>  L2 L#16 (512KB) + L1d L#16 (64KB) + L1i L#16 (64KB) + Core L#16 + PU
> L#16 (P#16)
>  L2 L#17 (512KB) + L1d L#17 (64KB) + L1i L#17 (64KB) + Core L#17 + PU
> L#17 (P#17)
>  L2 L#18 (512KB) + L1d L#18 (64KB) + L1i L#18 (64KB) + Core L#18 + PU
> L#18 (P#18)
>  L2 L#19 (512KB) + L1d L#19 (64KB) + L1i L#19 (64KB) + Core L#19 + PU
> L#19 (P#19)
>NUMANode L#5 (P#5 16GB) + L3 L#5 (5118KB)
>  L2 L#20 (512KB) + L1d L#20 (64KB) + L1i L#20 (64KB) + Core L#20 + PU
> L#20 (P#20)
>  L2 L#21 (512KB) + L1d L#21 (64KB) + L1i L#21 (64KB) + Core L#21 + PU
> L#21 (P#21)
>  

[OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 doesn't work for our magny cours based 32 core node

2013-12-18 Thread tmishima


Hi, I report one more problem with openmpi-1.7.4rc1,
which is more serious.

For our 32-core nodes (AMD Magny-Cours based), which have
8 NUMA nodes, "-bind-to numa" does not work. Without
this option, it works. For your information, I added the
lstopo output of the node at the bottom of this mail.

Regards,
Tetsuya Mishima

[mishima@manage ~]$ qsub -I -l nodes=1:ppn=32
qsub: waiting for job 8352.manage.cluster to start
qsub: job 8352.manage.cluster ready

[mishima@node03 demos]$ mpirun -np 8 -report-bindings -bind-to numa myprog
[node03.cluster:15316] [[37582,0],0] bind:upward target NUMANode type
Machine
--
A request was made to bind to NUMA, but an appropriate target could not
be found on node node03.
--
[mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
[mishima@node03 demos]$ mpirun -np 8 -report-bindings myprog
[node03.cluster:15282] MCW rank 2 bound to socket 1[core 8[hwt 0]]:
[./././././././.][B/././././././.][./././././././.][
./././././././.]
[node03.cluster:15282] MCW rank 3 bound to socket 1[core 9[hwt 0]]:
[./././././././.][./B/./././././.][./././././././.][
./././././././.]
[node03.cluster:15282] MCW rank 4 bound to socket 2[core 16[hwt 0]]:
[./././././././.][./././././././.][B/././././././.]
[./././././././.]
[node03.cluster:15282] MCW rank 5 bound to socket 2[core 17[hwt 0]]:
[./././././././.][./././././././.][./B/./././././.]
[./././././././.]
[node03.cluster:15282] MCW rank 6 bound to socket 3[core 24[hwt 0]]:
[./././././././.][./././././././.][./././././././.]
[B/././././././.]
[node03.cluster:15282] MCW rank 7 bound to socket 3[core 25[hwt 0]]:
[./././././././.][./././././././.][./././././././.]
[./B/./././././.]
[node03.cluster:15282] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././././.][./././././././.][./././././././.][
./././././././.]
[node03.cluster:15282] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././././.][./././././././.][./././././././.][
./././././././.]
Hello world from process 2 of 8
Hello world from process 5 of 8
Hello world from process 4 of 8
Hello world from process 3 of 8
Hello world from process 1 of 8
Hello world from process 7 of 8
Hello world from process 6 of 8
Hello world from process 0 of 8
[mishima@node03 demos]$ ~/opt/hwloc/bin/lstopo-no-graphics
Machine (126GB)
  Socket L#0 (32GB)
NUMANode L#0 (P#0 16GB) + L3 L#0 (5118KB)
  L2 L#0 (512KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0
(P#0)
  L2 L#1 (512KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1
(P#1)
  L2 L#2 (512KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2 + PU L#2
(P#2)
  L2 L#3 (512KB) + L1d L#3 (64KB) + L1i L#3 (64KB) + Core L#3 + PU L#3
(P#3)
NUMANode L#1 (P#1 16GB) + L3 L#1 (5118KB)
  L2 L#4 (512KB) + L1d L#4 (64KB) + L1i L#4 (64KB) + Core L#4 + PU L#4
(P#4)
  L2 L#5 (512KB) + L1d L#5 (64KB) + L1i L#5 (64KB) + Core L#5 + PU L#5
(P#5)
  L2 L#6 (512KB) + L1d L#6 (64KB) + L1i L#6 (64KB) + Core L#6 + PU L#6
(P#6)
  L2 L#7 (512KB) + L1d L#7 (64KB) + L1i L#7 (64KB) + Core L#7 + PU L#7
(P#7)
  Socket L#1 (32GB)
NUMANode L#2 (P#6 16GB) + L3 L#2 (5118KB)
  L2 L#8 (512KB) + L1d L#8 (64KB) + L1i L#8 (64KB) + Core L#8 + PU L#8
(P#8)
  L2 L#9 (512KB) + L1d L#9 (64KB) + L1i L#9 (64KB) + Core L#9 + PU L#9
(P#9)
  L2 L#10 (512KB) + L1d L#10 (64KB) + L1i L#10 (64KB) + Core L#10 + PU
L#10 (P#10)
  L2 L#11 (512KB) + L1d L#11 (64KB) + L1i L#11 (64KB) + Core L#11 + PU
L#11 (P#11)
NUMANode L#3 (P#7 16GB) + L3 L#3 (5118KB)
  L2 L#12 (512KB) + L1d L#12 (64KB) + L1i L#12 (64KB) + Core L#12 + PU
L#12 (P#12)
  L2 L#13 (512KB) + L1d L#13 (64KB) + L1i L#13 (64KB) + Core L#13 + PU
L#13 (P#13)
  L2 L#14 (512KB) + L1d L#14 (64KB) + L1i L#14 (64KB) + Core L#14 + PU
L#14 (P#14)
  L2 L#15 (512KB) + L1d L#15 (64KB) + L1i L#15 (64KB) + Core L#15 + PU
L#15 (P#15)
  Socket L#2 (32GB)
NUMANode L#4 (P#4 16GB) + L3 L#4 (5118KB)
  L2 L#16 (512KB) + L1d L#16 (64KB) + L1i L#16 (64KB) + Core L#16 + PU
L#16 (P#16)
  L2 L#17 (512KB) + L1d L#17 (64KB) + L1i L#17 (64KB) + Core L#17 + PU
L#17 (P#17)
  L2 L#18 (512KB) + L1d L#18 (64KB) + L1i L#18 (64KB) + Core L#18 + PU
L#18 (P#18)
  L2 L#19 (512KB) + L1d L#19 (64KB) + L1i L#19 (64KB) + Core L#19 + PU
L#19 (P#19)
NUMANode L#5 (P#5 16GB) + L3 L#5 (5118KB)
  L2 L#20 (512KB) + L1d L#20 (64KB) + L1i L#20 (64KB) + Core L#20 + PU
L#20 (P#20)
  L2 L#21 (512KB) + L1d L#21 (64KB) + L1i L#21 (64KB) + Core L#21 + PU
L#21 (P#21)
  L2 L#22 (512KB) + L1d L#22 (64KB) + L1i L#22 (64KB) + Core L#22 + PU
L#22 (P#22)
  L2 L#23 (512KB) + L1d L#23 (64KB) + L1i L#23 (64KB) + Core L#23 + PU
L#23 (P#23)
  Socket L#3 (32GB)
NUMANode L#6 (P#2 16GB) + L3 L#6 (5118KB)
  L2 L#24 (512KB) + L1d L#24 (64KB) + L1i L#24 (64KB) + Core L#24 + PU
L#24 (P#24)
  L2 L#25 (512KB) + L1d L#25 (64KB) + L1i L#25 (64KB) + Core 

Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager

2013-12-18 Thread Ralph Castain
Very strange - I can't seem to replicate it. Is there any chance that you have 
< 8 actual cores on node12?


On Dec 18, 2013, at 4:53 PM, tmish...@jcity.maeda.co.jp wrote:

> 
> 
> Hi Ralph, sorry for confusing you.
> 
> At that time, I cut and pasted part of the "cat $PBS_NODEFILE" output.
> I guess I missed the last line by mistake.
> 
> I retried the test, and below is exactly what I got when I did it.
> 
> [mishima@manage ~]$ qsub -I -l nodes=node11:ppn=8+node12:ppn=8
> qsub: waiting for job 8338.manage.cluster to start
> qsub: job 8338.manage.cluster ready
> 
> [mishima@node11 ~]$ cat $PBS_NODEFILE
> node11
> node11
> node11
> node11
> node11
> node11
> node11
> node11
> node12
> node12
> node12
> node12
> node12
> node12
> node12
> node12
> [mishima@node11 ~]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings myprog
> --
> A request was made to bind to that would result in binding more
> processes than cpus on a resource:
> 
>   Bind to: CORE
>   Node:node12
>   #processes:  2
>   #cpus:  1
> 
> You can override this protection by adding the "overload-allowed"
> option to your binding directive.
> --
> 
> Regards,
> 
> Tetsuya Mishima
> 
>> I removed the debug in #2 - thanks for reporting it
>> 
>> For #1, it actually looks to me like this is correct. If you look at your
> allocation, there are only 7 slots being allocated on node12, yet you have
> asked for 8 cpus to be assigned (2 procs with 4
>> cpus/proc). So the warning is in fact correct
>> 
>> 
>> On Dec 18, 2013, at 4:04 PM, tmish...@jcity.maeda.co.jp wrote:
>> 
>>> 
>>> 
>>> Hi Ralph, I found that openmpi-1.7.4rc1 was already uploaded. So I'd
> like
>>> to report
>>> 3 issues mainly regarding -cpus-per-proc.
>>> 
>>> 1) When I use 2 nodes (node11, node12), each of which has 8 cores
>>> (2 sockets x 4 cores/socket), it starts to produce the error again,
>>> as shown below. At least openmpi-1.7.4a1r29646 worked well.
>>> 
>>> [mishima@manage ~]$ qsub -I -l nodes=2:ppn=8
>>> qsub: waiting for job 8336.manage.cluster to start
>>> qsub: job 8336.manage.cluster ready
>>> 
>>> [mishima@node11 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>> [mishima@node11 demos]$ cat $PBS_NODEFILE
>>> node11
>>> node11
>>> node11
>>> node11
>>> node11
>>> node11
>>> node11
>>> node11
>>> node12
>>> node12
>>> node12
>>> node12
>>> node12
>>> node12
>>> node12
>>> [mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
>>> myprog
>>> 
> --
>>> A request was made to bind to that would result in binding more
>>> processes than cpus on a resource:
>>> 
>>>  Bind to: CORE
>>>  Node:node12
>>>  #processes:  2
>>>  #cpus:  1
>>> 
>>> You can override this protection by adding the "overload-allowed"
>>> option to your binding directive.
>>> 
> --
>>> 
>>> Of course it works well using only one node.
>>> 
>>> [mishima@node11 demos]$ mpirun -np 2 -cpus-per-proc 4 -report-bindings
>>> myprog
>>> [node11.cluster:26238] MCW rank 0 bound to socket 0[core 0[hwt 0]],
> socket
>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>> [node11.cluster:26238] MCW rank 1 bound to socket 1[core 4[hwt 0]],
> socket
>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>> Hello world from process 1 of 2
>>> Hello world from process 0 of 2
>>> 
>>> 
>>> 2) When I add "-bind-to numa", it works, but the message "bind:upward
>>> target NUMANode type NUMANode" appears.
>>> As far as I remember, I didn't see this kind of message before.
>>> 
>>> mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
>>> -bind-to numa myprog
>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
>>> NUMANode
>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
>>> NUMANode
>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
>>> NUMANode
>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
>>> NUMANode
>>> [node11.cluster:26260] MCW rank 0 bound to socket 0[core 0[hwt 0]],
> socket
>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>> [node11.cluster:26260] MCW rank 1 bound to socket 1[core 4[hwt 0]],
> socket
>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>> [node12.cluster:23607] MCW rank 3 bound to socket 1[core 4[hwt 0]],
> socket
>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>> [node12.cluster:23607] MCW rank 2 bound to socket 0[core 0[hwt 0]],
> socket
>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>> cket 0[core 3[hwt 0]]: 

Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager

2013-12-18 Thread tmishima


Hi Ralph, sorry for confusing you.

At that time, I cut and pasted part of the "cat $PBS_NODEFILE" output.
I guess I missed the last line by mistake.

I retried the test, and below is exactly what I got when I did it.

[mishima@manage ~]$ qsub -I -l nodes=node11:ppn=8+node12:ppn=8
qsub: waiting for job 8338.manage.cluster to start
qsub: job 8338.manage.cluster ready

[mishima@node11 ~]$ cat $PBS_NODEFILE
node11
node11
node11
node11
node11
node11
node11
node11
node12
node12
node12
node12
node12
node12
node12
node12
[mishima@node11 ~]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings myprog
--
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to: CORE
   Node:node12
   #processes:  2
   #cpus:  1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--

Regards,

Tetsuya Mishima

> I removed the debug in #2 - thanks for reporting it
>
> For #1, it actually looks to me like this is correct. If you look at your
allocation, there are only 7 slots being allocated on node12, yet you have
asked for 8 cpus to be assigned (2 procs with 4
> cpus/proc). So the warning is in fact correct
>
>
> On Dec 18, 2013, at 4:04 PM, tmish...@jcity.maeda.co.jp wrote:
>
> >
> >
> > Hi Ralph, I found that openmpi-1.7.4rc1 was already uploaded. So I'd
like
> > to report
> > 3 issues mainly regarding -cpus-per-proc.
> >
> > 1) When I use 2 nodes (node11, node12), each of which has 8 cores
> > (2 sockets x 4 cores/socket), it starts to produce the error again,
> > as shown below. At least openmpi-1.7.4a1r29646 worked well.
> >
> > [mishima@manage ~]$ qsub -I -l nodes=2:ppn=8
> > qsub: waiting for job 8336.manage.cluster to start
> > qsub: job 8336.manage.cluster ready
> >
> > [mishima@node11 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> > [mishima@node11 demos]$ cat $PBS_NODEFILE
> > node11
> > node11
> > node11
> > node11
> > node11
> > node11
> > node11
> > node11
> > node12
> > node12
> > node12
> > node12
> > node12
> > node12
> > node12
> > [mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
> > myprog
> >
--
> > A request was made to bind to that would result in binding more
> > processes than cpus on a resource:
> >
> >   Bind to: CORE
> >   Node:node12
> >   #processes:  2
> >   #cpus:  1
> >
> > You can override this protection by adding the "overload-allowed"
> > option to your binding directive.
> >
--
> >
> > Of course it works well using only one node.
> >
> > [mishima@node11 demos]$ mpirun -np 2 -cpus-per-proc 4 -report-bindings
> > myprog
> > [node11.cluster:26238] MCW rank 0 bound to socket 0[core 0[hwt 0]],
socket
> > 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> > cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
> > [node11.cluster:26238] MCW rank 1 bound to socket 1[core 4[hwt 0]],
socket
> > 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
> > cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
> > Hello world from process 1 of 2
> > Hello world from process 0 of 2
> >
> >
> > 2) When I add "-bind-to numa", it works, but the message "bind:upward
> > target NUMANode type NUMANode" appears.
> > As far as I remember, I didn't see this kind of message before.
> >
> > mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
> > -bind-to numa myprog
> > [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
> > NUMANode
> > [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
> > NUMANode
> > [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
> > NUMANode
> > [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
> > NUMANode
> > [node11.cluster:26260] MCW rank 0 bound to socket 0[core 0[hwt 0]],
socket
> > 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> > cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
> > [node11.cluster:26260] MCW rank 1 bound to socket 1[core 4[hwt 0]],
socket
> > 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
> > cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
> > [node12.cluster:23607] MCW rank 3 bound to socket 1[core 4[hwt 0]],
socket
> > 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
> > cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
> > [node12.cluster:23607] MCW rank 2 bound to socket 0[core 0[hwt 0]],
socket
> > 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> > cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
> > Hello world from process 1 of 4
> > Hello world from process 0 of 4
> > Hello world from process 3 of 4
> > Hello world from process 2 of 4
> >
> >
> > 3) I use the PGI compiler. It cannot accept the compiler switch
> > "-Wno-variadic-macros", which is included in the configure script.
> >
> >

Re: [OMPI users] What's the status of OpenMPI and thread safety?

2013-12-18 Thread Ralph Castain
This was, in fact, a primary point of discussion at last week's OMPI 
developer's conference. Bottom line is that we are only a little further along 
than we used to be, but are focusing on improving it. You'll find good thread 
support for some transports (some of the MTLs and at least the TCP BTL), not so 
good for others (e.g., openib is flat-out not thread safe).
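
For anyone who wants to check what a particular build and transport combination actually grants, here is a minimal sketch (not an official test, and thread_level.cpp is a made-up name) that requests MPI_THREAD_MULTIPLE and prints the level that comes back in "provided"; if I remember right, the 1.7 series also has to be configured with --enable-mpi-thread-multiple for the full level to be granted:

#include <mpi.h>
#include <cstdio>

int main(int argc, char *argv[])
{
    int provided = MPI_THREAD_SINGLE;

    /* ask for the highest level; the library answers with what it supports */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        std::printf("requested level %d, provided level %d\n",
                    MPI_THREAD_MULTIPLE, provided);

    MPI_Finalize();
    return 0;
}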


On Dec 18, 2013, at 3:57 PM, Blosch, Edwin L  wrote:

> I was wondering if the FAQ entry below is considered current opinion or 
> perhaps a little stale.  Is multi-threading still considered to be ‘lightly 
> tested’?  Are there known open bugs?
>  
> Thank you,
>  
> Ed
>  
>  
> 7. Is Open MPI thread safe?
>  
> Support for MPI_THREAD_MULTIPLE (i.e., multiple threads executing within the 
> MPI library) and asynchronous message passing progress (i.e., continuing 
> message passing operations even while no user threads are in the MPI library) 
> has been designed into Open MPI from its first planning meetings.
>  
> Support for MPI_THREAD_MULTIPLE is included in the first version of Open MPI, 
> but it is only lightly tested and likely still has some bugs. Support for 
> asynchronous progress is included in the TCP point-to-point device, but it, 
> too, has only had light testing and likely still has bugs.
>  
> Completing the testing for full support of MPI_THREAD_MULTIPLE and 
> asynchronous progress is planned in the near future.
>  



Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager

2013-12-18 Thread Ralph Castain
I removed the debug in #2 - thanks for reporting it

For #1, it actually looks to me like this is correct. If you look at your 
allocation, there are only 7 slots being allocated on node12, yet you have 
asked for 8 cpus to be assigned (2 procs with 4 cpus/proc). So the warning is 
in fact correct


On Dec 18, 2013, at 4:04 PM, tmish...@jcity.maeda.co.jp wrote:

> 
> 
> Hi Ralph, I found that openmpi-1.7.4rc1 was already uploaded. So I'd like
> to report
> 3 issues mainly regarding -cpus-per-proc.
> 
> 1) When I use 2 nodes (node11, node12), each of which has 8 cores
> (2 sockets x 4 cores/socket), it starts to produce the error again,
> as shown below. At least openmpi-1.7.4a1r29646 worked well.
> 
> [mishima@manage ~]$ qsub -I -l nodes=2:ppn=8
> qsub: waiting for job 8336.manage.cluster to start
> qsub: job 8336.manage.cluster ready
> 
> [mishima@node11 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> [mishima@node11 demos]$ cat $PBS_NODEFILE
> node11
> node11
> node11
> node11
> node11
> node11
> node11
> node11
> node12
> node12
> node12
> node12
> node12
> node12
> node12
> [mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
> myprog
> --
> A request was made to bind to that would result in binding more
> processes than cpus on a resource:
> 
>   Bind to: CORE
>   Node:node12
>   #processes:  2
>   #cpus:  1
> 
> You can override this protection by adding the "overload-allowed"
> option to your binding directive.
> --
> 
> Of course it works well using only one node.
> 
> [mishima@node11 demos]$ mpirun -np 2 -cpus-per-proc 4 -report-bindings
> myprog
> [node11.cluster:26238] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
> [node11.cluster:26238] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket
> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
> Hello world from process 1 of 2
> Hello world from process 0 of 2
> 
> 
> 2) When I add "-bind-to numa", it works, but the message "bind:upward
> target NUMANode type NUMANode" appears.
> As far as I remember, I didn't see this kind of message before.
> 
> mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
> -bind-to numa myprog
> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
> NUMANode
> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
> NUMANode
> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
> NUMANode
> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
> NUMANode
> [node11.cluster:26260] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
> [node11.cluster:26260] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket
> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
> [node12.cluster:23607] MCW rank 3 bound to socket 1[core 4[hwt 0]], socket
> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
> [node12.cluster:23607] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket
> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
> Hello world from process 1 of 4
> Hello world from process 0 of 4
> Hello world from process 3 of 4
> Hello world from process 2 of 4
> 
> 
> 3) I use the PGI compiler. It cannot accept the compiler switch
> "-Wno-variadic-macros", which is included in the configure script.
> 
>   btl_usnic_CFLAGS="-Wno-variadic-macros"
> 
> I removed this switch, then I could continue to build 1.7.4rc1.
> 
> Regards,
> Tetsuya Mishima
> 
> 
>> Hmmm...okay, I understand the scenario. Must be something in the algo
> when it only has one node, so it shouldn't be too hard to track down.
>> 
>> I'm off on travel for a few days, but will return to this when I get
> back.
>> 
>> Sorry for delay - will try to look at this while I'm gone, but can't
> promise anything :-(
>> 
>> 
>> On Dec 10, 2013, at 6:58 PM, tmish...@jcity.maeda.co.jp wrote:
>> 
>>> 
>>> 
>>> Hi Ralph, sorry for the confusion.
>>> 
>>> We usually log on to "manage", which is our control node.
>>> From manage, we submit jobs or enter a remote node such as
>>> node03 via Torque interactive mode (qsub -I).
>>> 
>>> At that time, instead of Torque, I just did rsh to node03 from manage
>>> and ran myprog on the node. I hope this makes clear what I did.
>>> 
>>> Now, I retried with "-host node03", which still causes the problem:
>>> (I confirmed that a local run on manage caused the same problem too)
>>> 
>>> [mishima@manage ~]$ rsh node03
>>> Last login: Wed Dec 11 11:38:57 from manage
>>> [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>> [mishima@node03 demos]$
>>> [mishima@node03 

Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager

2013-12-18 Thread Jeff Squyres (jsquyres)
On Dec 18, 2013, at 7:04 PM,  
 wrote:

> 3) I use the PGI compiler. It cannot accept the compiler switch
> "-Wno-variadic-macros", which is included in the configure script.
> 
>   btl_usnic_CFLAGS="-Wno-variadic-macros"

Yoinks.  I'll fix (that flag is only intended for our private copy of v1.6 -- 
trunk/v1.7 are C99, and that flag isn't necessary).

Thanks for pointing out the problem!

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager

2013-12-18 Thread tmishima


Hi Ralph, I found that openmpi-1.7.4rc1 was already uploaded. So I'd like
to report
3 issues mainly regarding -cpus-per-proc.

1) When I use 2 nodes (node11, node12), each of which has 8 cores
(2 sockets x 4 cores/socket), it starts to produce the error again,
as shown below. At least openmpi-1.7.4a1r29646 worked well.

[mishima@manage ~]$ qsub -I -l nodes=2:ppn=8
qsub: waiting for job 8336.manage.cluster to start
qsub: job 8336.manage.cluster ready

[mishima@node11 ~]$ cd ~/Desktop/openmpi-1.7/demos/
[mishima@node11 demos]$ cat $PBS_NODEFILE
node11
node11
node11
node11
node11
node11
node11
node11
node12
node12
node12
node12
node12
node12
node12
[mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
myprog
--
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to: CORE
   Node:node12
   #processes:  2
   #cpus:  1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--

Of course it works well using only one node.

[mishima@node11 demos]$ mpirun -np 2 -cpus-per-proc 4 -report-bindings
myprog
[node11.cluster:26238] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
[node11.cluster:26238] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket
1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
Hello world from process 1 of 2
Hello world from process 0 of 2


2) When I add "-bind-to numa", it works, but the message "bind:upward
target NUMANode type NUMANode" appears.
As far as I remember, I didn't see this kind of message before.

mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
-bind-to numa myprog
[node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
NUMANode
[node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
NUMANode
[node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
NUMANode
[node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type
NUMANode
[node11.cluster:26260] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
[node11.cluster:26260] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket
1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
[node12.cluster:23607] MCW rank 3 bound to socket 1[core 4[hwt 0]], socket
1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
[node12.cluster:23607] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
Hello world from process 1 of 4
Hello world from process 0 of 4
Hello world from process 3 of 4
Hello world from process 2 of 4


3) I use the PGI compiler. It cannot accept the compiler switch
"-Wno-variadic-macros", which is included in the configure script.

btl_usnic_CFLAGS="-Wno-variadic-macros"

I removed this switch, then I could continue to build 1.7.4rc1.

Regards,
Tetsuya Mishima


> Hmmm...okay, I understand the scenario. Must be something in the algo
when it only has one node, so it shouldn't be too hard to track down.
>
> I'm off on travel for a few days, but will return to this when I get
back.
>
> Sorry for delay - will try to look at this while I'm gone, but can't
promise anything :-(
>
>
> On Dec 10, 2013, at 6:58 PM, tmish...@jcity.maeda.co.jp wrote:
>
> >
> >
> > Hi Ralph, sorry for the confusion.
> >
> > We usually log on to "manage", which is our control node.
> > From manage, we submit jobs or enter a remote node such as
> > node03 via Torque interactive mode (qsub -I).
> >
> > At that time, instead of Torque, I just did rsh to node03 from manage
> > and ran myprog on the node. I hope this makes clear what I did.
> >
> > Now, I retried with "-host node03", which still causes the problem:
> > (I confirmed that a local run on manage caused the same problem too)
> >
> > [mishima@manage ~]$ rsh node03
> > Last login: Wed Dec 11 11:38:57 from manage
> > [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> > [mishima@node03 demos]$
> > [mishima@node03 demos]$ mpirun -np 8 -host node03 -report-bindings
> > -cpus-per-proc 4 -map-by socket myprog
> >
--
> > A request was made to bind to that would result in binding more
> > processes than cpus on a resource:
> >
> >   Bind to: CORE
> >   Node:node03
> >   #processes:  2
> >   #cpus:  1
> >
> > You can override this protection by adding the "overload-allowed"
> > option to your binding directive.
> >
--
> >
> > It's strange, 

[OMPI users] What's the status of OpenMPI and thread safety?

2013-12-18 Thread Blosch, Edwin L
I was wondering if the FAQ entry below is considered current opinion or perhaps 
a little stale.  Is multi-threading still considered to be 'lightly tested'?  
Are there known open bugs?

Thank you,

Ed


7. Is Open MPI thread safe?

Support for MPI_THREAD_MULTIPLE (i.e., multiple threads executing within the 
MPI library) and asynchronous message passing progress (i.e., continuing 
message passing operations even while no user threads are in the MPI library) 
has been designed into Open MPI from its first planning meetings.

Support for MPI_THREAD_MULTIPLE is included in the first version of Open MPI, 
but it is only lightly tested and likely still has some bugs. Support for 
asynchronous progress is included in the TCP point-to-point device, but it, 
too, has only had light testing and likely still has bugs.

Completing the testing for full support of MPI_THREAD_MULTIPLE and asynchronous 
progress is planned in the near future.



Re: [OMPI users] tcp of openmpi-1.7.3 under our environment is very slow

2013-12-18 Thread tmishima


Hi Jeff,

I ran the tests with processor binding enabled, using both openmpi-1.7.3
and 1.7.4rc1, but I got the same results as with no binding.

In addition, the core mapping of 1.7.4rc1 seems strange, although that
has no relation to the tcp slowdown.

Regards,
Tetsuya Mishima


[mishima@node08 OMB-3.1.1]$ mpirun -V
mpirun (Open MPI) 1.7.3

Report bugs to http://www.open-mpi.org/community/help/
[mishima@node08 OMB-3.1.1]$ mpirun -np 2 -host node08,node09 -mca btl
^openib -bind-to core -report-bindings osu_bw
[node08.cluster:23950] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././.][./././.]
[node09.cluster:23477] MCW rank 1 bound to socket 0[core 0[hwt 0]]:
[B/././.][./././.]
# OSU MPI Bandwidth Test v3.1.1
# Size       Bandwidth (MB/s)
1            0.00
2            0.01
4            0.01
8            0.02
16           0.05
32           0.09
64           6.49
128          0.39
256          1.74
512          9.51
1024         26.59
2048         182.55
4096         202.52
8192         217.44
16384        227.91
32768        231.11
65536        112.57
131072       217.01
262144       215.49
524288       233.97
1048576      231.33
2097152      235.04
4194304      234.77
[mishima@node08 OMB-3.1.1]$ mpirun -np 2 -host node08,node09 -mca btl
^openib -bind-to core -report-bindings osu_latency
[node08.cluster:23968] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././.][./././.]
[node09.cluster:23522] MCW rank 1 bound to socket 0[core 0[hwt 0]]:
[B/././.][./././.]
# OSU MPI Latency Test v3.1.1
# Size       Latency (us)
0            18.08
1            18.46
2            18.37
4            18.45
8            18.96
16           18.98
32           19.31
64           19.83
128          20.24
256          21.86
512          24.74
1024         30.02
2048         71.07
4096         73.64
8192         106.67
16384        176.36
32768        250.88
65536        20188.73
131072       21141.11
262144       18462.47
524288       24940.10
1048576      26160.76
2097152      29538.91
4194304      42420.03


[mishima@node08 OMB-3.1.1]$ mpirun -V
mpirun (Open MPI) 1.7.4rc1

Report bugs to http://www.open-mpi.org/community/help/
[mishima@node08 OMB-3.1.1]$ mpirun -np 2 -host node08,node09 -mca btl
^openib -bind-to core -report-bindings osu_bw
[node08.cluster:23932] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././.][./././.]
[node09.cluster:23409] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./.][./././.]
# OSU MPI Bandwidth Test v3.1.1
# Size       Bandwidth (MB/s)
1            0.00
2            0.01
4            0.01
8            0.03
16           0.05
32           0.08
64           6.35
128          0.34
256          3.79
512          8.38
1024         9.53
2048         182.12
4096         203.16
8192         215.49
16384        228.56
32768        231.28
65536        134.46
131072       217.33
262144       226.90
524288       220.98
1048576      234.73
2097152      232.56
4194304      234.78
[mishima@node08 OMB-3.1.1]$ mpirun -np 2 -host node08,node09 -mca btl
^openib -bind-to core -report-bindings osu_latency
[node08.cluster:23940] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././.][./././.]
[node09.cluster:23443] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./.][./././.]
# OSU MPI Latency Test v3.1.1
# Size       Latency (us)
0            19.99
1            19.79
2            19.87
4            20.04
8            19.99
16           20.00
32           20.12
64           20.85
128          21.27
256          22.73
512          25.57
1024         31.25
2048         41.68
4096         56.41
8192         90.48
16384        177.76
32768        252.26
65536        20489.12
131072       21235.08
262144       20278.82
524288       24009.70
1048576      25395.96
2097152      30260.70
4194304      41058.17

> Can you re-run these tests with processor binding enabled?
>
> On Dec 16, 2013, at 6:36 PM, tmish...@jcity.maeda.co.jp wrote:
>
> >
> >
> > Hi,
> >
> > I 

Re: [OMPI users] [OMPI devel] Recommended tool to measure packet counters

2013-12-18 Thread Siddhartha Jana
Ah got it ! Thanks

-- Sid


On 18 December 2013 07:44, Jeff Squyres (jsquyres) wrote:

> On Dec 14, 2013, at 8:02 AM, Siddhartha Jana 
> wrote:
>
> > Is there a preferred method/tool among developers of MPI-library for
> checking the count of the packets transmitted by the network card during
> two-sided communication?
> >
> > Is the use of
> > iptables -I INPUT -i eth0
> > iptables -I OUTPUT -o eth0
> >
> > recommended ?
>
> If you're using an ethernet, non-OS-bypass transport (e.g., TCP), you
> might also want to look at ethtool.
>
> Note that these counts will include control messages sent by Open MPI, too
> -- not just raw MPI traffic.  They also will not include any traffic sent
> across shared memory (or other transports).
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>


Re: [OMPI users] slowdown with infiniband and latest CentOS kernel

2013-12-18 Thread Martin Siegert
Hi,

expanding on Noam's problem a bit ...

On Wed, Dec 18, 2013 at 10:19:25AM -0500, Noam Bernstein wrote:
> Thanks to all who answered my question.  The culprit was an interaction 
> between
> 1.7.3 not supporting mpi_paffinity_alone (which we were using previously) and 
> the new 
> kernel.  Switching to --bind-to core (actually the environment variable 
> OMPI_MCA_hwloc_base_binding_policy=core) fixed the problem.
> 
> Noam

Thanks for figuring this out. Does this work for 1.6.x as well?
The FAQ http://www.open-mpi.org/faq/?category=tuning#using-paffinity
covers versions 1.2.x to 1.5.x. 
Does 1.6.x support mpi_paffinity_alone = 1 ?
I set this in openmpi-mca-params.conf but

# ompi_info | grep affinity
  MPI extensions: affinity example
   MCA paffinity: hwloc (MCA v2.0, API v2.0, Component v1.6.4)
   MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.6.4)
   MCA maffinity: hwloc (MCA v2.0, API v2.0, Component v1.6.4)

does not give any indication that this is actually used.

Cheers,
Martin

-- 
Martin Siegert
WestGrid/ComputeCanada
Simon Fraser University


Re: [OMPI users] slowdown with infiniband and latest CentOS kernel

2013-12-18 Thread Ake Sandgren
On Wed, 2013-12-18 at 11:47 -0500, Noam Bernstein wrote: 
> Yes - I never characterized it fully, but we attached gdb to every
> single running vasp process, and all were stuck in the same
> call to MPI_allreduce() every time. It's only happening on rather large
> jobs, so it's not the easiest setup to debug.

That sounds like one of the bugs I found in VASP.
Could you send me the input data that triggers this (with info on how it
was run, i.e. #mpi-tasks etc.) so I can check whether our heavily patched
version hits it.

/Åke S.



[OMPI users] Problem with memory in mpi program

2013-12-18 Thread Yeni Lora
My program uses MPI and OpenMP, and even this small sample program takes
a lot of memory. I don't know how much RAM an MPI program normally
consumes, and I want to know whether MPI consumes a lot of memory when it
is used together with OpenMP, or whether I am doing something wrong. To
measure the RAM used by my program I read the file /proc/id_proc/stat,
where id_proc is the id of my process.

This is my example program:

/* Note: the mailing-list archive stripped everything inside angle brackets
   and the '&' arguments; the headers, template parameter, and arguments
   marked below are reconstructed guesses, not the original text. */
#include <cstdio>
#include "mpi.h"
#include <omp.h>
#include <vector>
#include <cstdlib>

int main(int argc, char** argv){

    int my_rank;   /* rank of process */
    int p;         /* number of processes */
    int provided;  /* thread level actually granted (reconstructed) */

    MPI_Init_thread(&argc, &argv, MPI::THREAD_MULTIPLE, &provided);

    /* find out process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* find out number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    char cad[4];
    MPI_Status status;

    omp_set_num_threads(2);
#pragma omp parallel
    {
        int h = omp_get_thread_num();

        if(h==0){
            /* thread 0 sends one char to this same rank */
            MPI_Send(cad, 1, MPI::CHAR, my_rank, 11, MPI_COMM_WORLD);
        }
        else{
            /* thread 1 receives it */
            std::vector<char> all(2, 0);
            MPI_Recv(&all[0], 2, MPI::CHAR, MPI::ANY_SOURCE,
                     MPI::ANY_TAG, MPI_COMM_WORLD, &status);
        }
    }

    /* shut down MPI */
    MPI_Finalize();

    return 0;
}

Compile:
mpic++ -fopenmp -fno-threadsafe-statics -o sample_program sample_program.c

Run
mpirun  sample_program

and the memory consumption: 190MB

Please, I need help; it is very important for me to get the memory
consumption down.
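
As a side note on the measurement itself, here is a small sketch (not from the original post, Linux-only, and rss_check.cpp is a made-up name) that reads the virtual and resident sizes of the calling process from /proc/self/statm instead of parsing /proc/id_proc/stat by hand. Printing the same two numbers right before and right after MPI_Init_thread in the sample program would show how much of the 190MB comes from the MPI library itself rather than from the OpenMP part:

#include <cstdio>
#include <unistd.h>

int main()
{
    long size_pages = 0, resident_pages = 0;

    /* /proc/self/statm: first field is total program size, second is
       resident set size, both in pages */
    std::FILE *f = std::fopen("/proc/self/statm", "r");
    if (f != NULL) {
        if (std::fscanf(f, "%ld %ld", &size_pages, &resident_pages) == 2) {
            long page_kb = sysconf(_SC_PAGESIZE) / 1024;
            std::printf("VmSize: %ld kB  VmRSS: %ld kB\n",
                        size_pages * page_kb, resident_pages * page_kb);
        }
        std::fclose(f);
    }
    return 0;
}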


Re: [OMPI users] slowdown with infiniband and latest CentOS kernel

2013-12-18 Thread Brice Goglin
hwloc-ps (and lstopo --top) are better at showing process binding but they lack 
a nice pseudographical interface with dynamic refresh.
htop uses hwloc internally iirc, so there's hope we'll have everything needed 
in htop one day ;)
Brice



Dave Love wrote:
>John Hearns  writes:
>
>> 'Htop' is a very good tool for looking at where processes are
>running.
>
>I'd have thought hwloc-ps is the tool for that.


Re: [OMPI users] slowdown with infiniband and latest CentOS kernel

2013-12-18 Thread Dave Love
John Hearns  writes:

> 'Htop' is a very good tool for looking at where processes are running.

I'd have thought hwloc-ps is the tool for that.


Re: [OMPI users] slowdown with infiniband and latest CentOS kernel

2013-12-18 Thread Dave Love
Noam Bernstein  writes:

> We specifically switched to 1.7.3 because of a bug in 1.6.4 (lock up in some 
> collective communication), but now I'm wondering whether I should just test
> 1.6.5.

What bug, exactly?  As you mentioned vasp, is it specifically affecting
that?

We have seen apparent deadlocks with vasp -- which users assure me is
due to malfunctioning hardware and/or batch system -- but I don't think
there was any evidence of it being due to openmpi (1.4 and 1.6 on
different systems here).  I didn't have the padb --deadlock mode working
properly at the time I looked at one, but it seemed just to be stuck
with some ranks in broadcast and the rest in barrier.  Someone else put
a parallel debugger on it, but I'm not sure if there was a conclusive
result, and I'm not very interested in debugging proprietary programs.


Re: [OMPI users] slowdown with infiniband and latest CentOS kernel

2013-12-18 Thread Noam Bernstein
Thanks to all who answered my question.  The culprit was an interaction between
1.7.3 not supporting mpi_paffinity_alone (which we were using previously) and 
the new 
kernel.  Switching to --bind-to core (actually the environment variable 
OMPI_MCA_hwloc_base_binding_policy=core) fixed the problem.


Noam



Re: [OMPI users] [OMPI devel] Recommended tool to measure packet counters

2013-12-18 Thread Jeff Squyres (jsquyres)
On Dec 14, 2013, at 8:02 AM, Siddhartha Jana  wrote:

> Is there a preferred method/tool among developers of MPI-library for checking 
> the count of the packets transmitted by the network card during two-sided 
> communication?
> 
> Is the use of
> iptables -I INPUT -i eth0
> iptables -I OUTPUT -o eth0
> 
> recommended ?

If you're using an ethernet, non-OS-bypass transport (e.g., TCP), you might 
also want to look at ethtool.

Note that these counts will include control messages sent by Open MPI, too -- 
not just raw MPI traffic.  They also will not include any traffic sent across 
shared memory (or other transports).
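
For a rough per-node count without touching iptables rules, the kernel's per-interface counters under sysfs can also be snapshotted around the communication phase. A minimal sketch (an illustration, not an Open MPI facility; the interface name eth0 just follows the example above, and nic_counters.cpp is a made-up name):

#include <cstdio>

/* read a single integer counter from a sysfs statistics file */
static long long read_counter(const char *path)
{
    long long value = -1;
    std::FILE *f = std::fopen(path, "r");
    if (f != NULL) {
        std::fscanf(f, "%lld", &value);
        std::fclose(f);
    }
    return value;
}

int main()
{
    std::printf("tx_packets=%lld rx_packets=%lld\n",
                read_counter("/sys/class/net/eth0/statistics/tx_packets"),
                read_counter("/sys/class/net/eth0/statistics/rx_packets"));
    return 0;
}

Running it before and after the MPI phase and subtracting the two snapshots gives the packet count, with the same caveats as above: control traffic is included, and shared-memory traffic is not.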

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] tcp of openmpi-1.7.3 under our environment is very slow

2013-12-18 Thread Jeff Squyres (jsquyres)
Can you re-run these tests with processor binding enabled?

On Dec 16, 2013, at 6:36 PM, tmish...@jcity.maeda.co.jp wrote:

> 
> 
> Hi,
> 
> I usually use an infiniband network, where openmpi-1.7.3 and 1.6.5 work fine.
> 
> The other day, I had a chance to use a tcp network (1GbE) and I noticed that
> my application with openmpi-1.7.3 was quite a bit slower than with
> openmpi-1.6.5. So I ran the OSU MPI Bandwidth Test v3.1.1 as shown below,
> which shows that bandwidth for smaller sizes (< 1024) is very slow compared
> with 1.6.5. In addition, the latency for larger sizes (> 65536) seems strange.
> 
> Does this depend on our local environment, or would some mca parameter be
> necessary? I'm afraid that something is wrong with tcp in openmpi-1.7.3.
> 
> openmpi-1.7.3:
> 
> [mishima@node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca tbl
> ^openib osu_bw
> # OSU MPI Bandwidth Test v3.1.1
> # Size       Bandwidth (MB/s)
> 1            0.00
> 2            0.01
> 4            0.01
> 8            0.03
> 16           0.05
> 32           0.10
> 64           0.32
> 128          0.37
> 256          0.87
> 512          5.97
> 1024         20.00
> 2048         182.87
> 4096         202.53
> 8192         215.14
> 16384        225.16
> 32768        228.58
> 65536        115.23
> 131072       198.24
> 262144       193.38
> 524288       233.03
> 1048576      227.31
> 2097152      233.07
> 4194304      233.25
> 
> [mishima@node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca btl
> ^openib osu_latency
> # OSU MPI Latency Test v3.1.1
> # Size       Latency (us)
> 0            19.23
> 1            19.57
> 2            19.52
> 4            19.88
> 8            20.44
> 16           20.38
> 32           20.78
> 64           21.14
> 128          21.75
> 256          23.20
> 512          26.12
> 1024         31.54
> 2048         41.72
> 4096         64.55
> 8192         107.52
> 16384        179.23
> 32768        251.58
> 65536        20689.68
> 131072       21179.79
> 262144       20168.56
> 524288       22984.83
> 1048576      25994.54
> 2097152      30929.55
> 4194304      38028.48
> 
> openmpi-1.6.5:
> 
> [mishima@node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca tbl
> ^openib osu_bw
> # OSU MPI Bandwidth Test v3.1.1
> # Size       Bandwidth (MB/s)
> 1            0.22
> 2            0.45
> 4            0.89
> 8            1.77
> 16           3.57
> 32           7.15
> 64           14.28
> 128          28.58
> 256          57.17
> 512          96.44
> 1024         152.38
> 2048         182.84
> 4096         203.17
> 8192         215.13
> 16384        225.05
> 32768        100.58
> 65536        225.24
> 131072       182.92
> 262144       192.82
> 524288       212.92
> 1048576      233.35
> 2097152      233.72
> 4194304      233.89
> 
> [mishima@node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca btl
> ^openib osu_latency
> # OSU MPI Latency Test v3.1.1
> # Size       Latency (us)
> 0            17.24
> 1            17.30
> 2            17.29
> 4            17.30
> 8            24.32
> 16           17.24
> 32           17.80
> 64           17.91
> 128          19.08
> 256          20.81
> 512          22.83
> 1024         27.82
> 2048         39.54
> 4096         52.66
> 8192         97.70
> 16384        143.23
> 32768        215.02
> 65536        481.08
> 131072       800.64
> 262144       1475.12
> 524288       2698.62
> 1048576      4992.31
> 2097152      9558.96
> 4194304      20801.50
> 
> Regards,
> Tetsuya Mishima
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/