Very strange - I can't seem to replicate it. Is there any chance that you have < 8 actual cores on node12?
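If you get a chance, could you check what the topology actually looks like on that node? Running hwloc's lstopo there (assuming it is installed - OMPI uses hwloc internally for binding) would show the socket/core layout the mapper sees:

[mishima@node12 ~]$ lstopo

As a cross-check, the overload protection can be relaxed by adding the "overload-allowed" qualifier to the binding directive (e.g., "-bind-to core:overload-allowed"); if the job then runs but oversubscribes a core on node12, that would confirm the mapper believes node12 has fewer cpus than you expect.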
On Dec 18, 2013, at 4:53 PM, tmish...@jcity.maeda.co.jp wrote:

> Hi Ralph, sorry for the confusion.
>
> At that time, I cut and pasted the output of "cat $PBS_NODEFILE".
> I must have failed to paste the last line by mistake.
>
> I retried the test, and below is exactly what I got.
>
> [mishima@manage ~]$ qsub -I -l nodes=node11:ppn=8+node12:ppn=8
> qsub: waiting for job 8338.manage.cluster to start
> qsub: job 8338.manage.cluster ready
>
> [mishima@node11 ~]$ cat $PBS_NODEFILE
> node11
> node11
> node11
> node11
> node11
> node11
> node11
> node11
> node12
> node12
> node12
> node12
> node12
> node12
> node12
> node12
> [mishima@node11 ~]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings myprog
> --------------------------------------------------------------------------
> A request was made to bind to that would result in binding more
> processes than cpus on a resource:
>
> Bind to: CORE
> Node: node12
> #processes: 2
> #cpus: 1
>
> You can override this protection by adding the "overload-allowed"
> option to your binding directive.
> --------------------------------------------------------------------------
>
> Regards,
>
> Tetsuya Mishima
>
>> I removed the debug in #2 - thanks for reporting it
>>
>> For #1, it actually looks to me like this is correct. If you look at your
>> allocation, there are only 7 slots being allocated on node12, yet you have
>> asked for 8 cpus to be assigned (2 procs with 4 cpus/proc). So the warning
>> is in fact correct
>>
>> On Dec 18, 2013, at 4:04 PM, tmish...@jcity.maeda.co.jp wrote:
>>
>>> Hi Ralph, I found that openmpi-1.7.4rc1 was already uploaded, so I'd like
>>> to report 3 issues, mainly regarding -cpus-per-proc.
>>>
>>> 1) When I use 2 nodes (node11 and node12), which have 8 cores each
>>> (= 2 sockets x 4 cores/socket), it starts to produce the error again
>>> as shown below. At least openmpi-1.7.4a1r29646 worked well.
>>>
>>> [mishima@manage ~]$ qsub -I -l nodes=2:ppn=8
>>> qsub: waiting for job 8336.manage.cluster to start
>>> qsub: job 8336.manage.cluster ready
>>>
>>> [mishima@node11 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>> [mishima@node11 demos]$ cat $PBS_NODEFILE
>>> node11
>>> node11
>>> node11
>>> node11
>>> node11
>>> node11
>>> node11
>>> node11
>>> node12
>>> node12
>>> node12
>>> node12
>>> node12
>>> node12
>>> node12
>>> [mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings myprog
>>> --------------------------------------------------------------------------
>>> A request was made to bind to that would result in binding more
>>> processes than cpus on a resource:
>>>
>>> Bind to: CORE
>>> Node: node12
>>> #processes: 2
>>> #cpus: 1
>>>
>>> You can override this protection by adding the "overload-allowed"
>>> option to your binding directive.
>>> --------------------------------------------------------------------------
>>>
>>> Of course it works well using only one node.
>>>
>>> [mishima@node11 demos]$ mpirun -np 2 -cpus-per-proc 4 -report-bindings myprog
>>> [node11.cluster:26238] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>> [node11.cluster:26238] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>> Hello world from process 1 of 2
>>> Hello world from process 0 of 2
>>>
>>> 2) Adding "-bind-to numa", it works, but the message "bind:upward target
>>> NUMANode type NUMANode" appears. As far as I remember, I didn't see this
>>> kind of message before.
>>>
>>> [mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings -bind-to numa myprog
>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type NUMANode
>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type NUMANode
>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type NUMANode
>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode type NUMANode
>>> [node11.cluster:26260] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>> [node11.cluster:26260] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>> [node12.cluster:23607] MCW rank 3 bound to socket 1[core 4[hwt 0]], socket 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>> [node12.cluster:23607] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>> Hello world from process 1 of 4
>>> Hello world from process 0 of 4
>>> Hello world from process 3 of 4
>>> Hello world from process 2 of 4
>>>
>>> 3) I use the PGI compiler. It cannot accept the compiler switch
>>> "-Wno-variadic-macros", which is included in the configure script:
>>>
>>> btl_usnic_CFLAGS="-Wno-variadic-macros"
>>>
>>> I removed this switch, and then I could continue to build 1.7.4rc1.
>>>
>>> Regards,
>>> Tetsuya Mishima
>>>
>>>> Hmmm...okay, I understand the scenario. Must be something in the algo
>>>> when it only has one node, so it shouldn't be too hard to track down.
>>>>
>>>> I'm off on travel for a few days, but will return to this when I get back.
>>>>
>>>> Sorry for the delay - will try to look at this while I'm gone, but can't
>>>> promise anything :-(
>>>>
>>>> On Dec 10, 2013, at 6:58 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>
>>>>> Hi Ralph, sorry for the confusion.
>>>>>
>>>>> We usually log on to "manage", which is our control node.
>>>>> From manage, we submit jobs or enter a remote node such as
>>>>> node03 via Torque interactive mode (qsub -I).
>>>>>
>>>>> This time, instead of using Torque, I just rsh'd to node03 from manage
>>>>> and ran myprog on that node. I hope that clarifies what I did.
>>>>>
>>>>> Now, I retried with "-host node03", which still causes the problem
>>>>> (I confirmed that a local run on manage causes the same problem too):
>>>>>
>>>>> [mishima@manage ~]$ rsh node03
>>>>> Last login: Wed Dec 11 11:38:57 from manage
>>>>> [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>> [mishima@node03 demos]$
>>>>> [mishima@node03 demos]$ mpirun -np 8 -host node03 -report-bindings -cpus-per-proc 4 -map-by socket myprog
>>>>> --------------------------------------------------------------------------
>>>>> A request was made to bind to that would result in binding more
>>>>> processes than cpus on a resource:
>>>>>
>>>>> Bind to: CORE
>>>>> Node: node03
>>>>> #processes: 2
>>>>> #cpus: 1
>>>>>
>>>>> You can override this protection by adding the "overload-allowed"
>>>>> option to your binding directive.
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>> It's strange, but I have to report that "-map-by socket:span" worked well.
>>>>>
>>>>> [mishima@node03 demos]$ mpirun -np 8 -host node03 -report-bindings -cpus-per-proc 4 -map-by socket:span myprog
>>>>> [node03.cluster:11871] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>> [node03.cluster:11871] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>> [node03.cluster:11871] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>> [node03.cluster:11871] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]: [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>> [node03.cluster:11871] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>> [node03.cluster:11871] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]: [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>> [node03.cluster:11871] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>> [node03.cluster:11871] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>> Hello world from process 2 of 8
>>>>> Hello world from process 6 of 8
>>>>> Hello world from process 3 of 8
>>>>> Hello world from process 7 of 8
>>>>> Hello world from process 1 of 8
>>>>> Hello world from process 5 of 8
>>>>> Hello world from process 0 of 8
>>>>> Hello world from process 4 of 8
>>>>>
>>>>> Regards,
>>>>> Tetsuya Mishima
>>>>>
>>>>>> On Dec 10, 2013, at 6:05 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>
>>>>>>> Hi Ralph,
>>>>>>>
>>>>>>> I tried again with -cpus-per-proc 2 as shown below.
>>>>>>> Here, I found that "-map-by socket:span" worked well.
>>>>>>>
>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 2 -map-by socket:span myprog
>>>>>>> [node03.cluster:10879] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././././.][B/B/./././././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10879] MCW rank 3 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][././B/B/./././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10879] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]]: [./././././././.][./././././././.][B/B/./././././.][./././././././.]
>>>>>>> [node03.cluster:10879] MCW rank 5 bound to socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][././B/B/./././.][./././././././.]
>>>>>>> [node03.cluster:10879] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/./././././.]
>>>>>>> [node03.cluster:10879] MCW rank 7 bound to socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][././B/B/./././.]
>>>>>>> [node03.cluster:10879] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././.][./././././././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10879] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>> Hello world from process 1 of 8
>>>>>>> Hello world from process 0 of 8
>>>>>>> Hello world from process 4 of 8
>>>>>>> Hello world from process 2 of 8
>>>>>>> Hello world from process 7 of 8
>>>>>>> Hello world from process 6 of 8
>>>>>>> Hello world from process 5 of 8
>>>>>>> Hello world from process 3 of 8
>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 2 -map-by socket myprog
>>>>>>> [node03.cluster:10921] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B/./.][./././././././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10921] MCW rank 3 bound to socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././././B/B][./././././././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10921] MCW rank 4 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././././.][B/B/./././././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10921] MCW rank 5 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][././B/B/./././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10921] MCW rank 6 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]]: [./././././././.][././././B/B/./.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10921] MCW rank 7 bound to socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././././B/B][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10921] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././.][./././././././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:10921] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>> Hello world from process 5 of 8
>>>>>>> Hello world from process 1 of 8
>>>>>>> Hello world from process 6 of 8
>>>>>>> Hello world from process 4 of 8
>>>>>>> Hello world from process 2 of 8
>>>>>>> Hello world from process 0 of 8
>>>>>>> Hello world from process 7 of 8
>>>>>>> Hello world from process 3 of 8
>>>>>>>
>>>>>>> "-np 8" and "-cpus-per-proc 4" just filled all the sockets.
>>>>>>> In this case, I guess "-map-by socket:span" and "-map-by socket" have
>>>>>>> the same meaning. Therefore, there's no problem there. Sorry for
>>>>>>> disturbing you.
>>>>>>
>>>>>> No problem - glad you could clear that up :-)
>>>>>>
>>>>>>> By the way, through this test, I found another problem.
>>>>>>> Without the Torque manager, just using rsh, it causes the same error,
>>>>>>> as below:
>>>>>>>
>>>>>>> [mishima@manage openmpi-1.7]$ rsh node03
>>>>>>> Last login: Wed Dec 11 09:42:02 from manage
>>>>>>> [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 -map-by socket myprog
>>>>>>
>>>>>> I don't understand the difference here - you are simply starting it from
>>>>>> a different node? It looks like everything is expected to run local to
>>>>>> mpirun, yes? So there is no rsh actually involved here.
>>>>>> Are you still running in an allocation?
>>>>>>
>>>>>> If you run this with "-host node03" on the cmd line, do you see the same
>>>>>> problem?
>>>>>>
>>>>>>> --------------------------------------------------------------------------
>>>>>>> A request was made to bind to that would result in binding more
>>>>>>> processes than cpus on a resource:
>>>>>>>
>>>>>>> Bind to: CORE
>>>>>>> Node: node03
>>>>>>> #processes: 2
>>>>>>> #cpus: 1
>>>>>>>
>>>>>>> You can override this protection by adding the "overload-allowed"
>>>>>>> option to your binding directive.
>>>>>>> --------------------------------------------------------------------------
>>>>>>> [mishima@node03 demos]$
>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 myprog
>>>>>>> [node03.cluster:11036] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:11036] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>>>> [node03.cluster:11036] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>> [node03.cluster:11036] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]: [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>>>> [node03.cluster:11036] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>>>> [node03.cluster:11036] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]: [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>>>> [node03.cluster:11036] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>> [node03.cluster:11036] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>> Hello world from process 4 of 8
>>>>>>> Hello world from process 2 of 8
>>>>>>> Hello world from process 6 of 8
>>>>>>> Hello world from process 5 of 8
>>>>>>> Hello world from process 3 of 8
>>>>>>> Hello world from process 7 of 8
>>>>>>> Hello world from process 0 of 8
>>>>>>> Hello world from process 1 of 8
>>>>>>>
>>>>>>> Regards,
>>>>>>> Tetsuya Mishima
>>>>>>>
>>>>>>>> Hmmm...that's strange. I only have 2 sockets on my system, but let me
>>>>>>>> poke around a bit and see what might be happening.
>>>>>>>>
>>>>>>>> On Dec 10, 2013, at 4:47 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>
>>>>>>>>> Hi Ralph,
>>>>>>>>>
>>>>>>>>> Thanks. I didn't know the meaning of "socket:span".
>>>>>>>>>
>>>>>>>>> But it still causes the problem; it seems socket:span doesn't work.
>>>>>>>>>
>>>>>>>>> [mishima@manage demos]$ qsub -I -l nodes=node03:ppn=32
>>>>>>>>> qsub: waiting for job 8265.manage.cluster to start
>>>>>>>>> qsub: job 8265.manage.cluster ready
>>>>>>>>>
>>>>>>>>> [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 -map-by socket:span myprog
>>>>>>>>> [node03.cluster:10262] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:10262] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:10262] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>> [node03.cluster:10262] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]: [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>>>>>> [node03.cluster:10262] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>>>>>> [node03.cluster:10262] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]: [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>>>>>> [node03.cluster:10262] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:10262] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>>
>>>>>>>>>> No, that is actually correct. We map a socket until full, then move
>>>>>>>>>> to the next. What you want is --map-by socket:span
>>>>>>>>>>
>>>>>>>>>> On Dec 10, 2013, at 3:42 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>
>>>>>>>>>>> I had time to try your patch yesterday using openmpi-1.7.4a1r29646.
>>>>>>>>>>> It stopped the error, but unfortunately "mapping by socket" itself
>>>>>>>>>>> didn't work well, as shown below:
>>>>>>>>>>>
>>>>>>>>>>> [mishima@manage demos]$ qsub -I -l nodes=1:ppn=32
>>>>>>>>>>> qsub: waiting for job 8260.manage.cluster to start
>>>>>>>>>>> qsub: job 8260.manage.cluster ready
>>>>>>>>>>>
>>>>>>>>>>> [mishima@node04 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>>>>>>>> [mishima@node04 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 -map-by socket myprog
>>>>>>>>>>> [node04.cluster:27489] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>>>> [node04.cluster:27489] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>>>>>>>> [node04.cluster:27489] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>>>> [node04.cluster:27489] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]: [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>>>>>>>> [node04.cluster:27489] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>>>>>>>> [node04.cluster:27489] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]: [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>>>>>>>> [node04.cluster:27489] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>> [node04.cluster:27489] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>>>>
>>>>>>>>>>> I think it should be like this:
>>>>>>>>>>>
>>>>>>>>>>> rank 00 [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>> rank 01 [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>>>> rank 02 [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>>>> ...
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Tetsuya Mishima
>>>>>>>>>>>
>>>>>>>>>>>> I fixed this under the trunk (was an issue regardless of RM) and
>>>>>>>>>>>> have scheduled it for 1.7.4.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>> Ralph
>>>>>>>>>>>>
>>>>>>>>>>>> On Nov 25, 2013, at 4:22 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you very much for your quick response.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm afraid to say that I found one more issue...
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's not so serious. Please check it when you have time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem is cpus-per-proc with the -map-by option under the
>>>>>>>>>>>>> Torque manager. It doesn't work as shown below. I guess you can
>>>>>>>>>>>>> get the same behaviour under the Slurm manager.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Of course, if I remove the -map-by option, it works quite well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [mishima@manage testbed2]$ qsub -I -l nodes=1:ppn=32
>>>>>>>>>>>>> qsub: waiting for job 8116.manage.cluster to start
>>>>>>>>>>>>> qsub: job 8116.manage.cluster ready
>>>>>>>>>>>>>
>>>>>>>>>>>>> [mishima@node03 ~]$ cd ~/Ducom/testbed2
>>>>>>>>>>>>> [mishima@node03 testbed2]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 -map-by socket mPre
>>>>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>>>>> A request was made to bind to that would result in binding more
>>>>>>>>>>>>> processes than cpus on a resource:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bind to: CORE
>>>>>>>>>>>>> Node: node03
>>>>>>>>>>>>> #processes: 2
>>>>>>>>>>>>> #cpus: 1
>>>>>>>>>>>>>
>>>>>>>>>>>>> You can override this protection by adding the "overload-allowed"
>>>>>>>>>>>>> option to your binding directive.
>>>>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>>>>>
>>>>>>>>>>>>> [mishima@node03 testbed2]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 mPre
>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]: [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]: [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Tetsuya Mishima
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fixed and scheduled to move to 1.7.4. Thanks again!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Nov 17, 2013, at 6:11 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks! That's precisely where I was going to look when I had time :-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll update tomorrow.
>>>>>>>>>>>>>> Ralph
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Nov 17, 2013 at 7:01 PM, <tmish...@jcity.maeda.co.jp> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is a continuation of "Segmentation fault in oob_tcp.c of
>>>>>>>>>>>>>> openmpi-1.7.4a1r29646".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I found the cause.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> First, I noticed that your hostfile works and mine does not.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Your host file:
>>>>>>>>>>>>>> cat hosts
>>>>>>>>>>>>>> bend001 slots=12
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My host file:
>>>>>>>>>>>>>> cat hosts
>>>>>>>>>>>>>> node08
>>>>>>>>>>>>>> node08
>>>>>>>>>>>>>> ...(total 8 lines)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I modified my script file to add "slots=1" to each line of my
>>>>>>>>>>>>>> hostfile just before launching mpirun. Then it worked.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My host file (modified):
>>>>>>>>>>>>>> cat hosts
>>>>>>>>>>>>>> node08 slots=1
>>>>>>>>>>>>>> node08 slots=1
>>>>>>>>>>>>>> ...(total 8 lines)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Second, I confirmed that there's a slight difference between
>>>>>>>>>>>>>> orte/util/hostfile/hostfile.c of 1.7.3 and that of 1.7.4a1r29646.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ diff hostfile.c.org ../../../../openmpi-1.7.3/orte/util/hostfile/hostfile.c
>>>>>>>>>>>>>> 394,401c394,399
>>>>>>>>>>>>>> <     if (got_count) {
>>>>>>>>>>>>>> <         node->slots_given = true;
>>>>>>>>>>>>>> <     } else if (got_max) {
>>>>>>>>>>>>>> <         node->slots = node->slots_max;
>>>>>>>>>>>>>> <         node->slots_given = true;
>>>>>>>>>>>>>> <     } else {
>>>>>>>>>>>>>> <         /* should be set by obj_new, but just to be clear */
>>>>>>>>>>>>>> <         node->slots_given = false;
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>> >     if (!got_count) {
>>>>>>>>>>>>>> >         if (got_max) {
>>>>>>>>>>>>>> >             node->slots = node->slots_max;
>>>>>>>>>>>>>> >         } else {
>>>>>>>>>>>>>> >             ++node->slots;
>>>>>>>>>>>>>> >         }
>>>>>>>>>>>>>> ....
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Finally, I added line 402 below just as a tentative trial.
>>>>>>>>>>>>>> Then, it worked.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> cat -n orte/util/hostfile/hostfile.c:
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 394    if (got_count) {
>>>>>>>>>>>>>> 395        node->slots_given = true;
>>>>>>>>>>>>>> 396    } else if (got_max) {
>>>>>>>>>>>>>> 397        node->slots = node->slots_max;
>>>>>>>>>>>>>> 398        node->slots_given = true;
>>>>>>>>>>>>>> 399    } else {
>>>>>>>>>>>>>> 400        /* should be set by obj_new, but just to be clear */
>>>>>>>>>>>>>> 401        node->slots_given = false;
>>>>>>>>>>>>>> 402        ++node->slots;  /* added by tmishima */
>>>>>>>>>>>>>> 403    }
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please fix the problem properly, because my change is just based
>>>>>>>>>>>>>> on a random guess. It's related to the treatment of a hostfile
>>>>>>>>>>>>>> where slots information is not given.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Tetsuya Mishima
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users