Ralph,

I can easily reproduce the issue with two nodes and the latest master.

All commands are run on n1, which has the same topology (2 sockets * 8 cores each) as n2.


1) everything works

$ mpirun -np 16 -bind-to core --report-bindings true
[n1:29794] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
[n1:29794] MCW rank 1 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.]
[n1:29794] MCW rank 2 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
[n1:29794] MCW rank 3 bound to socket 1[core 9[hwt 0]]: [./././././././.][./B/./././././.]
[n1:29794] MCW rank 4 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
[n1:29794] MCW rank 5 bound to socket 1[core 10[hwt 0]]: [./././././././.][././B/././././.]
[n1:29794] MCW rank 6 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
[n1:29794] MCW rank 7 bound to socket 1[core 11[hwt 0]]: [./././././././.][./././B/./././.]
[n1:29794] MCW rank 8 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
[n1:29794] MCW rank 9 bound to socket 1[core 12[hwt 0]]: [./././././././.][././././B/././.]
[n1:29794] MCW rank 10 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
[n1:29794] MCW rank 11 bound to socket 1[core 13[hwt 0]]: [./././././././.][./././././B/./.]
[n1:29794] MCW rank 12 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
[n1:29794] MCW rank 13 bound to socket 1[core 14[hwt 0]]: [./././././././.][././././././B/.]
[n1:29794] MCW rank 14 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
[n1:29794] MCW rank 15 bound to socket 1[core 15[hwt 0]]: [./././././././.][./././././././B]

$ mpirun -np 16 -bind-to core --host n1:16 --report-bindings true
[n1:29850] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
[n1:29850] MCW rank 1 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.]
[n1:29850] MCW rank 2 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
[n1:29850] MCW rank 3 bound to socket 1[core 9[hwt 0]]: [./././././././.][./B/./././././.]
[n1:29850] MCW rank 4 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
[n1:29850] MCW rank 5 bound to socket 1[core 10[hwt 0]]: [./././././././.][././B/././././.]
[n1:29850] MCW rank 6 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
[n1:29850] MCW rank 7 bound to socket 1[core 11[hwt 0]]: [./././././././.][./././B/./././.]
[n1:29850] MCW rank 8 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
[n1:29850] MCW rank 9 bound to socket 1[core 12[hwt 0]]: [./././././././.][././././B/././.]
[n1:29850] MCW rank 10 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
[n1:29850] MCW rank 11 bound to socket 1[core 13[hwt 0]]: [./././././././.][./././././B/./.]
[n1:29850] MCW rank 12 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
[n1:29850] MCW rank 13 bound to socket 1[core 14[hwt 0]]: [./././././././.][././././././B/.]
[n1:29850] MCW rank 14 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
[n1:29850] MCW rank 15 bound to socket 1[core 15[hwt 0]]: [./././././././.][./././././././B]

2) with another node

$ mpirun -np 16 -bind-to core --host n2:16 --report-bindings true

/* no output with a non-MPI app! */

$ mpirun -np 16 -bind-to core --host n2:16 --report-bindings ./hello_c
[n2:52851] MCW rank 0 not bound
[n2:52852] MCW rank 1 not bound
[n2:52853] MCW rank 2 not bound
[n2:52854] MCW rank 3 not bound
[n2:52855] MCW rank 4 not bound
[n2:52856] MCW rank 5 not bound
[n2:52857] MCW rank 6 not bound
[n2:52859] MCW rank 7 not bound
[n2:52861] MCW rank 8 not bound
[n2:52864] MCW rank 9 not bound
[n2:52866] MCW rank 10 not bound
[n2:52869] MCW rank 11 not bound
[n2:52877] MCW rank 15 not bound
[n2:52871] MCW rank 12 not bound
[n2:52873] MCW rank 13 not bound
[n2:52876] MCW rank 14 not bound
Hello, world, I am 0 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 1 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 2 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 3 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 4 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 5 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 6 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 7 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 8 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 9 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 10 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 11 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 12 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 13 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 14 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
Hello, world, I am 15 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)

/* the binding status is reported with an MPI app, but no binding has been performed */
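
To check the binding independently of --report-bindings, a small non-MPI script can print the CPU affinity the kernel actually applied to each launched process. This is only a sketch: it assumes Linux (os.sched_getaffinity is not available elsewhere), assumes Python 3 is installed on the compute nodes, and treats "allowed set smaller than the CPU count" as a heuristic for "bound". The file name show_affinity.py is hypothetical. OMPI_COMM_WORLD_RANK is the environment variable Open MPI sets for launched processes.

```python
import os
import socket


def format_affinity(cpus):
    """Render a set of CPU ids as a sorted, comma-separated string."""
    return ",".join(str(c) for c in sorted(cpus))


def describe_binding(cpus, total_cpus):
    """Heuristic: call the process bound if it may run on fewer CPUs than exist."""
    return "bound" if len(cpus) < total_cpus else "not bound"


if __name__ == "__main__":
    # 0 means "the calling process"; Linux-only call.
    allowed = os.sched_getaffinity(0)
    total = os.cpu_count()
    # Set by Open MPI's launcher; "?" when run outside mpirun.
    rank = os.environ.get("OMPI_COMM_WORLD_RANK", "?")
    print(
        f"[{socket.gethostname()}:{os.getpid()}] rank {rank} "
        f"{describe_binding(allowed, total)} cpus={format_affinity(allowed)}"
    )
```

It could be launched in place of "true" or ./hello_c, for example: mpirun -np 16 -bind-to core --host n2:16 python3 ./show_affinity.py. The CPU ids printed are OS (physical) indices, so they may not match the logical core numbers in the --report-bindings output.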

3) workaround: use -map-by core (works even with a non-MPI app)

$ mpirun -np 16 -bind-to core -map-by core --host n2:16 --report-bindings true
[n2:52982] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
[n2:52982] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
[n2:52982] MCW rank 2 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
[n2:52982] MCW rank 3 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
[n2:52982] MCW rank 4 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
[n2:52982] MCW rank 5 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
[n2:52982] MCW rank 6 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
[n2:52982] MCW rank 7 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
[n2:52982] MCW rank 8 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.]
[n2:52982] MCW rank 9 bound to socket 1[core 9[hwt 0]]: [./././././././.][./B/./././././.]
[n2:52982] MCW rank 10 bound to socket 1[core 10[hwt 0]]: [./././././././.][././B/././././.]
[n2:52982] MCW rank 11 bound to socket 1[core 11[hwt 0]]: [./././././././.][./././B/./././.]
[n2:52982] MCW rank 12 bound to socket 1[core 12[hwt 0]]: [./././././././.][././././B/././.]
[n2:52982] MCW rank 13 bound to socket 1[core 13[hwt 0]]: [./././././././.][./././././B/./.]
[n2:52982] MCW rank 14 bound to socket 1[core 14[hwt 0]]: [./././././././.][././././././B/.]
[n2:52982] MCW rank 15 bound to socket 1[core 15[hwt 0]]: [./././././././.][./././././././B]
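
Note that the workaround also changes the placement: without -map-by, the ranks alternate between the two sockets, whereas with -map-by core they fill the cores in order. The sketch below only reproduces the rank-to-core numbering seen in the outputs of this message for a 2 sockets * 8 cores node. It is not Open MPI's mapper code, and the function names are made up for illustration.

```python
SOCKETS = 2           # from this message: 2 sockets per node
CORES_PER_SOCKET = 8  # from this message: 8 cores per socket


def core_alternating_sockets(rank):
    """Core seen in the outputs without -map-by: ranks alternate between sockets."""
    socket = rank % SOCKETS
    index_in_socket = rank // SOCKETS
    return socket * CORES_PER_SOCKET + index_in_socket


def core_by_core(rank):
    """Core seen in the output with -map-by core: ranks fill cores in order."""
    return rank


if __name__ == "__main__":
    print("rank  alternating-sockets  by-core")
    for rank in range(SOCKETS * CORES_PER_SOCKET):
        print(f"{rank:4d}  {core_alternating_sockets(rank):19d}  {core_by_core(rank):7d}")
```

An application whose neighbouring ranks communicate heavily may therefore behave differently under the workaround, since ranks 0 and 1 share a socket with -map-by core but not with the default placement.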


Note that if both nodes are used, binding works fine:

$ mpirun -np 32 -bind-to core --host n1:16,n2:16 --report-bindings true
[n1:30008] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
[n1:30008] MCW rank 1 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.]
[n1:30008] MCW rank 2 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
[n1:30008] MCW rank 3 bound to socket 1[core 9[hwt 0]]: [./././././././.][./B/./././././.]
[n1:30008] MCW rank 4 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
[n1:30008] MCW rank 5 bound to socket 1[core 10[hwt 0]]: [./././././././.][././B/././././.]
[n1:30008] MCW rank 6 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
[n1:30008] MCW rank 7 bound to socket 1[core 11[hwt 0]]: [./././././././.][./././B/./././.]
[n1:30008] MCW rank 8 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
[n1:30008] MCW rank 9 bound to socket 1[core 12[hwt 0]]: [./././././././.][././././B/././.]
[n1:30008] MCW rank 10 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
[n1:30008] MCW rank 11 bound to socket 1[core 13[hwt 0]]: [./././././././.][./././././B/./.]
[n1:30008] MCW rank 12 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
[n1:30008] MCW rank 13 bound to socket 1[core 14[hwt 0]]: [./././././././.][././././././B/.]
[n1:30008] MCW rank 14 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
[n1:30008] MCW rank 15 bound to socket 1[core 15[hwt 0]]: [./././././././.][./././././././B]
[n2:53187] MCW rank 16 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
[n2:53187] MCW rank 17 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.]
[n2:53187] MCW rank 18 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
[n2:53187] MCW rank 19 bound to socket 1[core 9[hwt 0]]: [./././././././.][./B/./././././.]
[n2:53187] MCW rank 20 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
[n2:53187] MCW rank 21 bound to socket 1[core 10[hwt 0]]: [./././././././.][././B/././././.]
[n2:53187] MCW rank 22 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
[n2:53187] MCW rank 23 bound to socket 1[core 11[hwt 0]]: [./././././././.][./././B/./././.]
[n2:53187] MCW rank 24 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
[n2:53187] MCW rank 25 bound to socket 1[core 12[hwt 0]]: [./././././././.][././././B/././.]
[n2:53187] MCW rank 26 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
[n2:53187] MCW rank 27 bound to socket 1[core 13[hwt 0]]: [./././././././.][./././././B/./.]
[n2:53187] MCW rank 28 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
[n2:53187] MCW rank 29 bound to socket 1[core 14[hwt 0]]: [./././././././.][././././././B/.]
[n2:53187] MCW rank 30 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
[n2:53187] MCW rank 31 bound to socket 1[core 15[hwt 0]]: [./././././././.][./././././././B]
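
When comparing several such runs, it can help to reduce the --report-bindings text to a rank -> (host, core) table and flag unbound ranks or cores shared by several ranks. The following is a rough sketch that assumes the exact record format shown in this message (single-core bindings and "not bound" only). The helper names are hypothetical, and other binding widths would need a more general pattern.

```python
import re

# One "--report-bindings" record as printed in this thread. Only the
# single-core form and the "not bound" form are recognised.
RECORD = re.compile(
    r"\[(?P<host>[^:\]\s]+):\d+\] MCW rank (?P<rank>\d+) "
    r"(?:bound to socket (?P<socket>\d+)\[core (?P<core>\d+)\[hwt \d+\]\]"
    r"|(?P<unbound>not bound))"
)


def parse_bindings(text):
    """Map each rank to (host, core); core is None for 'not bound' records."""
    bindings = {}
    for match in RECORD.finditer(text):
        core = None if match.group("unbound") else int(match.group("core"))
        bindings[int(match.group("rank"))] = (match.group("host"), core)
    return bindings


def unbound_ranks(bindings):
    """Sorted list of ranks reported as not bound."""
    return sorted(rank for rank, (_, core) in bindings.items() if core is None)


def shared_cores(bindings):
    """(host, core) pairs that more than one rank is bound to."""
    users = {}
    for rank, (host, core) in bindings.items():
        if core is not None:
            users.setdefault((host, core), []).append(rank)
    return {key: ranks for key, ranks in users.items() if len(ranks) > 1}


if __name__ == "__main__":
    sample = (
        "[n1:30008] MCW rank 0 bound to socket 0[core 0[hwt 0]]: "
        "[B/././././././.][./././././././.]\n"
        "[n2:52851] MCW rank 1 not bound\n"
    )
    table = parse_bindings(sample)
    print(table)
    print("unbound:", unbound_ranks(table))
```

Because the pattern is searched anywhere in the text, it also copes with several records pasted onto one line.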


Cheers,


Gilles


On 4/14/2017 12:43 AM, r...@open-mpi.org wrote:
All right, let’s replace rmaps_base_verbose with odls_base_verbose and see what 
that says

On Apr 13, 2017, at 8:34 AM, Cyril Bordage <cyril.bord...@inria.fr> wrote:

'-report-bindings' does that.
I used this option because the ranks did not seem to be bound (if I use
a rank file, the performance is far better).
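
For reference, a rank file of the kind mentioned above can be generated rather than written by hand. The sketch below is an assumption-laden illustration, not the file actually used here: it takes the host names and the 2 sockets * 12 cores layout from the log quoted further down, binds one rank per core in order, and uses the "rank N=host slot=socket:core" line format described in the mpirun man page (with the core index assumed to be relative to the socket). The function name is hypothetical.

```python
def make_rankfile(hosts, sockets, cores_per_socket):
    """Return rank file text binding one rank to each core, host by host."""
    lines = []
    rank = 0
    for host in hosts:
        for socket in range(sockets):
            for core in range(cores_per_socket):
                # "slot=S:C" is read here as socket S, core C within that socket.
                lines.append(f"rank {rank}={host} slot={socket}:{core}")
                rank += 1
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Host names and sizes taken from the quoted log; adjust for other clusters.
    print(make_rankfile(["miriel025", "miriel026"], sockets=2, cores_per_socket=12), end="")
```

The resulting file would then be passed to mpirun with its rank file option (-rf or --rankfile in the versions I am aware of).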

On 13/04/2017 at 17:24, r...@open-mpi.org wrote:
Okay, so as far as OMPI is concerned, it correctly bound everyone! So how are 
you generating this output claiming it isn’t bound?

On Apr 13, 2017, at 7:57 AM, Cyril Bordage <cyril.bord...@inria.fr> wrote:

[devel11:17550] [[29888,0],0] rmaps:base set policy with NULL device NONNULL
[devel11:17550] mca:rmaps:select: checking available component mindist
[devel11:17550] mca:rmaps:select: Querying component [mindist]
[devel11:17550] mca:rmaps:select: checking available component ppr
[devel11:17550] mca:rmaps:select: Querying component [ppr]
[devel11:17550] mca:rmaps:select: checking available component rank_file
[devel11:17550] mca:rmaps:select: Querying component [rank_file]
[devel11:17550] mca:rmaps:select: checking available component resilient
[devel11:17550] mca:rmaps:select: Querying component [resilient]
[devel11:17550] mca:rmaps:select: checking available component round_robin
[devel11:17550] mca:rmaps:select: Querying component [round_robin]
[devel11:17550] mca:rmaps:select: checking available component seq
[devel11:17550] mca:rmaps:select: Querying component [seq]
[devel11:17550] [[29888,0],0]: Final mapper priorities
[devel11:17550]         Mapper: ppr Priority: 90
[devel11:17550]         Mapper: seq Priority: 60
[devel11:17550]         Mapper: resilient Priority: 40
[devel11:17550]         Mapper: mindist Priority: 20
[devel11:17550]         Mapper: round_robin Priority: 10
[devel11:17550]         Mapper: rank_file Priority: 0
[miriel025:62329] [[29888,0],1] rmaps:base set policy with NULL device
NONNULL
[miriel025:62329] mca:rmaps:select: checking available component mindist
[miriel025:62329] mca:rmaps:select: Querying component [mindist]
[miriel025:62329] mca:rmaps:select: checking available component ppr
[miriel025:62329] mca:rmaps:select: Querying component [ppr]
[miriel025:62329] mca:rmaps:select: checking available component rank_file
[miriel025:62329] mca:rmaps:select: Querying component [rank_file]
[miriel026:165125] [[29888,0],2] rmaps:base set policy with NULL device
NONNULL
[miriel026:165125] mca:rmaps:select: checking available component mindist
[miriel026:165125] mca:rmaps:select: Querying component [mindist]
[miriel026:165125] mca:rmaps:select: checking available component ppr
[miriel026:165125] mca:rmaps:select: Querying component [ppr]
[miriel026:165125] mca:rmaps:select: checking available component rank_file
[miriel026:165125] mca:rmaps:select: Querying component [rank_file]
[miriel026:165125] mca:rmaps:select: checking available component resilient
[miriel026:165125] mca:rmaps:select: Querying component [resilient]
[miriel026:165125] mca:rmaps:select: checking available component
round_robin
[miriel026:165125] mca:rmaps:select: Querying component [round_robin]
[miriel026:165125] mca:rmaps:select: checking available component seq
[miriel026:165125] mca:rmaps:select: Querying component [seq]
[miriel026:165125] [[29888,0],2]: Final mapper priorities
[miriel026:165125]      Mapper: ppr Priority: 90
[[devel11:17550] mca:rmaps: mapping job [29888,1]
[devel11:17550] mca:rmaps: setting mapping policies for job [29888,1]
nprocs 48
[devel11:17550] mca:rmaps[169] mapping not set by user - using bynuma
[devel11:17550] mca:rmaps:ppr: job [29888,1] not using ppr mapper PPR
NULL policy PPR NOTSET
[devel11:17550] [[29888,0],0] rmaps:seq called on job [29888,1]
[devel11:17550] mca:rmaps:seq: job [29888,1] not using seq mapper
[devel11:17550] mca:rmaps:resilient: cannot perform initial map of job
[29888,1] - no fault groups
[devel11:17550] mca:rmaps:mindist: job [29888,1] not using mindist mapper
[devel11:17550] mca:rmaps:rr: mapping job [29888,1]
[devel11:17550] [[29888,0],0] Starting with 2 nodes in list
[devel11:17550] [[29888,0],0] Filtering thru apps
[miriel025:62329] mca:rmaps:select: checking available component resilient
[miriel025:62329] mca:rmaps:select: Querying component [resilient]
[miriel025:62329] mca:rmaps:select: checking available component round_robin
[miriel025:62329] mca:rmaps:select: Querying component [round_robin]
[miriel025:62329] mca:rmaps:select: checking available component seq
[miriel025:62329] mca:rmaps:select: Querying component [seq]
[miriel025:62329] [[29888,0],1]: Final mapper priorities
[miriel025:62329]       Mapper: ppr Priority: 90
[miriel025:62329]       Mapper: seq Priority: 60
[miriel025:62329]       Mapper: resilient Priority: 40
[miriel025:62329]       Mapper: mindist Priority: 20
[miriel025:62329]       Mapper: round_robin Priority: 10
[miriel025:62329]       Mapper: rank_file Priority: 0
[miriel025:62329] mca:rmaps: mapping job [29888,1]
[miriel025:62329] mca:rmaps: setting mapping policies for job [29888,1]
nprocs 48
[miriel025:62329] mca:rmaps[169] mapping not set by user - using bynuma
[miriel025:62329] mca:rmaps:ppr: job [29888,1] not using ppr mapper PPR
NULL policy PPR NOTSET
[miriel025:62329] [[29888,0],1] rmaps:seq called on job [29888,1]
[miriel025:62329] mca:rmaps:seq: job [29888,1] not using seq mapper
[miriel025:62329] mca:rmaps:resilient: cannot perform initial map of job
[29888,1] - no fault groups
[miriel025:62329] mca:rmaps:mindist: job [29888,1] not using mindist mapper
[miriel025:62329] mca:rmaps:rr: mapping job [29888,1]
[miriel025:62329] [[29888,0],1] Starting with 2 nodes in list
[miriel025:62329] [[29888,0],1] Filtering thru apps
[devel11:17550] [[29888,0],0] Retained 2 nodes in list
[devel11:17550] [[29888,0],0] node miriel025 has 24 slots available
[devel11:17550] [[29888,0],0] node miriel026 has 24 slots available
[devel11:17550] AVAILABLE NODES FOR MAPPING:
[devel11:17550]     node: miriel025 daemon: 1
[devel11:17550]     node: miriel026 daemon: 2
[devel11:17550] [[29888,0],0] Starting bookmark at node miriel025
[devel11:17550] [[29888,0],0] Starting at node miriel025
[devel11:17550] mca:rmaps:rr: mapping no-span by NUMANode for job
[29888,1] slots 48 num_procs 48
[devel11:17550] mca:rmaps:rr: found 4 NUMANode objects on node miriel025
[devel11:17550] mca:rmaps:rr: calculated nprocs 24
[devel11:17550] mca:rmaps:rr: assigning nprocs 24
[devel11:17550] mca:rmaps:rr: found 4 NUMANode objects on node miriel026
[devel11:17550] mca:rmaps:rr: calculated nprocs 24
[devel11:17550] mca:rmaps:rr: assigning nprocs 24
[devel11:17550] mca:rmaps:base: computing vpids by slot for job [29888,1]
[devel11:17550] mca:rmaps:base: assigning rank 0 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 1 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 2 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 3 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 4 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 5 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 6 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 7 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 8 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 9 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 10 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 11 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 12 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 13 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 14 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 15 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 16 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 17 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 18 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 19 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 20 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 21 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 22 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 23 to node miriel025
[devel11:17550] mca:rmaps:base: assigning rank 24 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 25 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 26 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 27 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 28 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 29 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 30 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 31 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 32 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 33 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 34 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 35 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 36 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 37 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 38 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 39 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 40 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 41 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 42 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 43 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 44 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 45 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 46 to node miriel026
[devel11:17550] mca:rmaps:base: assigning rank 47 to node miriel026
[devel11:17550] [[29888,0],0] rmaps:base:compute_usage
[devel11:17550] mca:rmaps: compute bindings for job [29888,1] with
policy CORE[4008]
[devel11:17550] [[29888,0],0] bind_depth: 6 map_depth 2
[devel11:17550] mca:rmaps: bind downward for job [29888,1] with bindings
CORE
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],0] BITMAP 0
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],0][miriel025] TO
socket 0[core 0[hwt 0]]: [B/././././././././././.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],1] BITMAP 12
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],1][miriel025] TO
socket 0[core 6[hwt 0]]: [././././././B/././././.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],2] BITMAP 1
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],2][miriel025] TO
socket 1[core 12[hwt 0]]: [./././././././././././.][B/././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],3] BITMAP 13
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],3][miriel025] TO
socket 1[core 18[hwt 0]]: [./././././././././././.][././././././B/././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],4] BITMAP 2
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],4][miriel025] TO
socket 0[core 1[hwt 0]]: [./B/./././././././././.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],5] BITMAP 14
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],5][miriel025] TO
socket 0[core 7[hwt 0]]: [./././././././B/./././.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],6] BITMAP 3
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],6][miriel025] TO
socket 1[core 13[hwt 0]]: [./././././././././././.][./B/./././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],7] BITMAP 15
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],7][miriel025] TO
socket 1[core 19[hwt 0]]: [./././././././././././.][./././././././B/./././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],8] BITMAP 4
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],8][miriel025] TO
socket 0[core 2[hwt 0]]: [././B/././././././././.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],9] BITMAP 16
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],9][miriel025] TO
socket 0[core 8[hwt 0]]: [././././././././B/././.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],10] BITMAP 5
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],10][miriel025] TO
socket 1[core 14[hwt 0]]: [./././././././././././.][././B/././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],11] BITMAP 17
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],11][miriel025] TO
socket 1[core 20[hwt 0]]: [./././././././././././.][././././././././B/././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],12] BITMAP 6
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],12][miriel025] TO
socket 0[core 3[hwt 0]]: [./././B/./././././././.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],13] BITMAP 18
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],13][miriel025] TO
socket 0[core 9[hwt 0]]: [./././././././././B/./.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],14] BITMAP 7
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],14][miriel025] TO
socket 1[core 15[hwt 0]]: [./././././././././././.][./././B/./././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],15] BITMAP 19
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],15][miriel025] TO
socket 1[core 21[hwt 0]]: [./././././././././././.][./././././././././B/./.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],16] BITMAP 8
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],16][miriel025] TO
socket 0[core 4[hwt 0]]: [././././B/././././././.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],17] BITMAP 20
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],17][miriel025] TO
socket 0[core 10[hwt 0]]: [././././././././././B/.][./././././././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],18] BITMAP 9
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],18][miriel025] TO
socket 1[core 16[hwt 0]]: [./././././././././././.][././././B/././././././.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],19] BITMAP 21
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],19][miriel025] TO
socket 1[core 22[hwt 0]]: [./././././././././././.][././././././././././B/.]
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[devel11:17550] [[29888,0],0] PROC [[29888,1],20] BITMAP 10
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],20][miriel025] TO
socket 0[core 5[hwt 0]]: [./././././B/./././././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] Retained 2 nodes in list
[miriel025:62329] [[29888,0],1] node miriel025 has 24 slots available
[miriel025:62329] [[29888,0],1] node miriel026 has 24 slots available
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[miriel025:62329] AVAILABLE NODES FOR MAPPING:
[miriel025:62329]     node: miriel025 daemon: 1
[miriel025:62329]     node: miriel026 daemon: 2
[miriel025:62329] [[29888,0],1] Starting bookmark at node miriel025
[miriel025:62329] [[29888,0],1] Starting at node miriel025
[devel11:17550] [[29888,0],0] PROC [[29888,1],21] BITMAP 22
[miriel025:62329] mca:rmaps:rr: mapping no-span by NUMANode for job
[29888,1] slots 48 num_procs 48
[miriel025:62329] mca:rmaps:rr: found 4 NUMANode objects on node miriel025
[miriel025:62329] mca:rmaps:rr: calculated nprocs 24
[miriel025:62329] mca:rmaps:rr: assigning nprocs 24
[miriel025:62329] mca:rmaps:rr: found 4 NUMANode objects on node miriel026
[miriel025:62329] mca:rmaps:rr: calculated nprocs 24
[miriel025:62329] mca:rmaps:rr: assigning nprocs 24
[miriel025:62329] mca:rmaps:base: computing vpids by slot for job [29888,1]
[miriel025:62329] mca:rmaps:base: assigning rank 0 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 1 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 2 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 3 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 4 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 5 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 6 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 7 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 8 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 9 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 10 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 11 to node miriel025
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],21][miriel025] TO
socket 0[core 11[hwt 0]]: [./././././././././././B][./././././././././././.]
[miriel025:62329] mca:rmaps:base: assigning rank 12 to node miriel025
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[miriel025:62329] mca:rmaps:base: assigning rank 13 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 14 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 15 to node miriel025
[devel11:17550] [[29888,0],0] PROC [[29888,1],22] BITMAP 11
[miriel025:62329] mca:rmaps:base: assigning rank 16 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 17 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 18 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 19 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 20 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 21 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 22 to node miriel025
[miriel025:62329] mca:rmaps:base: assigning rank 23 to node miriel025
miriel026:165125]       Mapper: seq Priority: 60
[miriel026:165125]      Mapper: resilient Priority: 40
[miriel026:165125]      Mapper: mindist Priority: 20
[miriel026:165125]      Mapper: round_robin Priority: 10
[miriel026:165125]      Mapper: rank_file Priority: 0
[miriel025:62329] mca:rmaps:base: assigning rank 24 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 25 to node miriel026
[miriel026:165125] mca:rmaps: mapping job [29888,1]
[miriel026:165125] mca:rmaps: setting mapping policies for job [29888,1]
nprocs 48
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],22][miriel025] TO
socket 1[core 17[hwt 0]]: [./././././././././././.][./././././B/./././././.]
[miriel026:165125] mca:rmaps[169] mapping not set by user - using bynuma
[miriel026:165125] mca:rmaps:ppr: job [29888,1] not using ppr mapper PPR
NULL policy PPR NOTSET
[devel11:17550] [[29888,0],0] GOT 1 CPUS
[miriel026:165125] [[29888,0],2] rmaps:seq called on job [29888,1]
[miriel026:165125] mca:rmaps:seq: job [29888,1] not using seq mapper
[devel11:17550] [[29888,0],0] PROC [[29888,1],23] BITMAP 23
[miriel025:62329] mca:rmaps:base: assigning rank 26 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 27 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 28 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 29 to node miriel026
[miriel026:165125] mca:rmaps:resilient: cannot perform initial map of
job [29888,1] - no fault groups
[miriel026:165125] mca:rmaps:mindist: job [29888,1] not using mindist mapper
[miriel025:62329] mca:rmaps:base: assigning rank 30 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 31 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 32 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 33 to node miriel026
[miriel026:165125] mca:rmaps:rr: mapping job [29888,1]
[miriel026:165125] [[29888,0],2] Starting with 2 nodes in list
[miriel025:62329] mca:rmaps:base: assigning rank 34 to node miriel026
[miriel026:165125] [[29888,0],2] Filtering thru apps
[devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],23][miriel025] TO
socket 1[core 23[hwt 0]]: [./././././././././././.][./././././././././././B]
[miriel025:62329] mca:rmaps:base: assigning rank 35 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 36 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 37 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 38 to node miriel026
[miriel026:165125] [[29888,0],2] Retained 2 nodes in list
[miriel026:165125] [[29888,0],2] node miriel025 has 24 slots available
[miriel026:165125] [[29888,0],2] node miriel026 has 24 slots available
[miriel026:165125] AVAILABLE NODES FOR MAPPING:
[miriel026:165125]     node: miriel025 daemon: 1
[miriel026:165125]     node: miriel026 daemon: 2
[miriel026:165125] [[29888,0],2] Starting bookmark at node miriel025
[miriel026:165125] [[29888,0],2] Starting at node miriel025
[miriel026:165125] mca:rmaps:rr: mapping no-span by NUMANode for job
[29888,1] slots 48 num_procs 48
[miriel026:165125] mca:rmaps:rr: found 4 NUMANode objects on node miriel025
[miriel026:165125] mca:rmaps:rr: calculated nprocs 24
[miriel026:165125] mca:rmaps:rr: assigning nprocs 24
[miriel026:165125] mca:rmaps:rr: found 4 NUMANode objects on node miriel026
[miriel026:165125] mca:rmaps:rr: calculated nprocs 24
[miriel026:165125] mca:rmaps:rr: assigning nprocs 24
[miriel026:165125] mca:rmaps:base: computing vpids by slot for job [29888,1]
[miriel026:165125] mca:rmaps:base: assigning rank 0 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 1 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 2 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 3 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 4 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 5 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 6 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 7 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 8 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 9 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 10 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 11 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 12 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 13 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 14 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 15 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 16 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 17 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 18 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 19 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 20 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 21 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 22 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 23 to node miriel025
[miriel026:165125] mca:rmaps:base: assigning rank 24 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 25 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 26 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 27 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 28 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 29 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 30 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 31 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 32 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 33 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 34 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 35 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 36 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 37 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 38 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 39 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 40 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 41 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 42 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 43 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 44 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 45 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 46 to node miriel026
[miriel026:165125] mca:rmaps:base: assigning rank 47 to node miriel026
[miriel026:165125] [[29888,0],2] rmaps:base:compute_usage
[miriel025:62329] mca:rmaps:base: assigning rank 39 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 40 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 41 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 42 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 43 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 44 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 45 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 46 to node miriel026
[miriel025:62329] mca:rmaps:base: assigning rank 47 to node miriel026
[miriel025:62329] [[29888,0],1] rmaps:base:compute_usage
[miriel025:62329] mca:rmaps: compute bindings for job [29888,1] with
policy CORE[4008]
[miriel025:62329] [[29888,0],1] bind_depth: 6 map_depth 2
[miriel025:62329] mca:rmaps: bind downward for job [29888,1] with
bindings CORE
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],24] BITMAP 0
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],24][miriel026] TO
socket 0[core 0[hwt 0]]: [B/././././././././././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],25] BITMAP 12
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],25][miriel026] TO
socket 0[core 6[hwt 0]]: [././././././B/././././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],26] BITMAP 1
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],26][miriel026] TO
socket 1[core 12[hwt 0]]: [./././././././././././.][B/././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],27] BITMAP 13
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],27][miriel026] TO
socket 1[core 18[hwt 0]]: [./././././././././././.][././././././B/././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],28] BITMAP 2
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],28][miriel026] TO
socket 0[core 1[hwt 0]]: [./B/./././././././././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],29] BITMAP 14
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],29][miriel026] TO
socket 0[core 7[hwt 0]]: [./././././././B/./././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],30] BITMAP 3
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],30][miriel026] TO
socket 1[core 13[hwt 0]]: [./././././././././././.][./B/./././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],31] BITMAP 15
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],31][miriel026] TO
socket 1[core 19[hwt 0]]: [./././././././././././.][./././././././B/./././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],32] BITMAP 4
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],32][miriel026] TO
socket 0[core 2[hwt 0]]: [././B/././././././././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],33] BITMAP 16
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],33][miriel026] TO
socket 0[core 8[hwt 0]]: [././././././././B/././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],34] BITMAP 5
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],34][miriel026] TO
socket 1[core 14[hwt 0]]: [./././././././././././.][././B/././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],35] BITMAP 17
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],35][miriel026] TO
socket 1[core 20[hwt 0]]: [./././././././././././.][././././././././B/././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],36] BITMAP 6
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],36][miriel026] TO
socket 0[core 3[hwt 0]]: [./././B/./././././././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],37] BITMAP 18
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],37][miriel026] TO
socket 0[core 9[hwt 0]]: [./././././././././B/./.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],38] BITMAP 7
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],38][miriel026] TO
socket 1[core 15[hwt 0]]: [./././././././././././.][./././B/./././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],39] BITMAP 19
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],39][miriel026] TO
socket 1[core 21[hwt 0]]: [./././././././././././.][./././././././././B/./.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],40] BITMAP 8
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],40][miriel026] TO
socket 0[core 4[hwt 0]]: [././././B/././././././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],41] BITMAP 20
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],41][miriel026] TO
socket 0[core 10[hwt 0]]: [././././././././././B/.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],42] BITMAP 9
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],42][miriel026] TO
socket 1[core 16[hwt 0]]: [./././././././././././.][././././B/././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],43] BITMAP 21
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],43][miriel026] TO
socket 1[core 22[hwt 0]]: [./././././././././././.][././././././././././B/.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],44] BITMAP 10
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],44][miriel026] TO
socket 0[core 5[hwt 0]]: [./././././B/./././././.][./././././././././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],45] BITMAP 22
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],45][miriel026] TO
socket 0[core 11[hwt 0]]: [./././././././././././B][./././././././././././.]
[miriel026:165125] mca:rmaps: compute bindings for job [29888,1] with
policy CORE[4008]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],46] BITMAP 11
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],46][miriel026] TO
socket 1[core 17[hwt 0]]: [./././././././././././.][./././././B/./././././.]
[miriel025:62329] [[29888,0],1] GOT 1 CPUS
[miriel025:62329] [[29888,0],1] PROC [[29888,1],47] BITMAP 23
[miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],47][miriel026] TO
socket 1[core 23[hwt 0]]: [./././././././././././.][./././././././././././B]
[miriel025:62342] MCW rank 0 not bound
[miriel026:165138] MCW rank 24 not bound
[miriel025:62343] MCW rank 1 not bound
[miriel026:165139] MCW rank 25 not bound
[miriel025:62344] MCW rank 2 not bound
[miriel026:165140] MCW rank 26 not bound
[miriel025:62345] MCW rank 3 not bound
[miriel026:165141] MCW rank 27 not bound
[miriel025:62346] MCW rank 4 not bound
[miriel026:165142] MCW rank 28 not bound
[miriel025:62348] MCW rank 5 not bound
[miriel026:165144] MCW rank 29 not bound
[miriel025:62350] MCW rank 6 not bound
[miriel026:165146] MCW rank 30 not bound
[miriel025:62352] MCW rank 7 not bound
[miriel025:62354] MCW rank 8 not bound
[miriel026:165148] MCW rank 31 not bound
[miriel025:62356] MCW rank 9 not bound
[miriel025:62358] MCW rank 10 not bound
[miriel026:165150] MCW rank 32 not bound
[miriel026:165152] MCW rank 33 not bound
[miriel026:165154] MCW rank 34 not bound
[miriel025:62360] MCW rank 11 not bound
[miriel026:165156] MCW rank 35 not bound
[miriel025:62362] MCW rank 12 not bound
[miriel026:165158] MCW rank 36 not bound
[miriel025:62366] MCW rank 14 not bound
[miriel025:62364] MCW rank 13 not bound
[miriel026:165160] MCW rank 37 not bound
[miriel025:62368] MCW rank 15 not bound
[miriel026:165163] MCW rank 39 not bound
[miriel025:62370] MCW rank 16 not bound
[miriel026:165166] MCW rank 40 not bound
[miriel026:165167] MCW rank 41 not bound
[miriel026:165162] MCW rank 38 not bound
[miriel025:62372] MCW rank 17 not bound
[miriel025:62374] MCW rank 18 not bound
[miriel026:165170] MCW rank 42 not bound
[miriel025:62376] MCW rank 19 not bound
[miriel026:165171] MCW rank 43 not bound
[miriel025:62377] MCW rank 20 not bound
[miriel026:165173] MCW rank 44 not bound
[miriel025:62380] MCW rank 21 not bound
[miriel025:62381] MCW rank 22 not bound
[miriel026:165177] MCW rank 46 not bound
[miriel026:165175] MCW rank 45 not bound
[miriel026:165179] MCW rank 47 not bound
[miriel025:62384] MCW rank 23 not bound


On 13/04/2017 at 16:52, r...@open-mpi.org wrote:
Okay, so your login node was able to figure out all the bindings. I don’t see
any debug output from your compute nodes, which is suspicious.

Try adding --leave-session-attached to the cmd line and let’s see if we can 
capture the compute node daemon’s output
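
One way to check which hosts actually produced mapper debug output is to group the captured lines by their "[hostname:pid]" prefix. The following is a minimal sketch, not part of the original exchange; the function name hosts_with_rmaps_output and the sample text are illustrative assumptions, with sample lines copied from the log quoted in this thread.

```python
import re

# Open MPI prefixes each diagnostic line with "[hostname:pid]".
PREFIX = re.compile(r"^\[([^\]:\s]+):\d+\]")

def hosts_with_rmaps_output(log_text):
    """Return the set of hosts that emitted at least one rmaps debug line."""
    hosts = set()
    for line in log_text.splitlines():
        m = PREFIX.match(line)
        # Wrapped continuation lines such as "[2965,1] - no fault groups"
        # have no "host:pid" prefix and are skipped.
        if m and "rmaps" in line:
            hosts.add(m.group(1))
    return hosts

sample = """\
[devel11:80858] mca:rmaps: mapping job [2965,1]
[devel11:80858] [[2965,0],0] rmaps:base:compute_usage
[miriel025:60980] MCW rank 11 not bound
[miriel026:163985] MCW rank 34 not bound
"""

print(sorted(hosts_with_rmaps_output(sample)))
```

If only the login node shows up in the result, the compute node daemons' output was not captured, which is the situation described above.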

On Apr 13, 2017, at 7:48 AM, Cyril Bordage <cyril.bord...@inria.fr> wrote:

My machine file is: miriel025*24 miriel026*24

On 13/04/2017 at 16:46, Cyril Bordage wrote:
Here is the output:
##############################################################################
[devel11:80858] [[2965,0],0] rmaps:base set policy with NULL device NONNULL
[devel11:80858] mca:rmaps:select: checking available component mindist
[devel11:80858] mca:rmaps:select: Querying component [mindist]
[devel11:80858] mca:rmaps:select: checking available component ppr
[devel11:80858] mca:rmaps:select: Querying component [ppr]
[devel11:80858] mca:rmaps:select: checking available component rank_file
[devel11:80858] mca:rmaps:select: Querying component [rank_file]
[devel11:80858] mca:rmaps:select: checking available component resilient
[devel11:80858] mca:rmaps:select: Querying component [resilient]
[devel11:80858] mca:rmaps:select: checking available component round_robin
[devel11:80858] mca:rmaps:select: Querying component [round_robin]
[devel11:80858] mca:rmaps:select: checking available component seq
[devel11:80858] mca:rmaps:select: Querying component [seq]
[devel11:80858] [[2965,0],0]: Final mapper priorities
[devel11:80858]         Mapper: ppr Priority: 90
[devel11:80858]         Mapper: seq Priority: 60
[devel11:80858]         Mapper: resilient Priority: 40
[devel11:80858]         Mapper: mindist Priority: 20
[devel11:80858]         Mapper: round_robin Priority: 10
[devel11:80858]         Mapper: rank_file Priority: 0
[devel11:80858] mca:rmaps: mapping job [2965,1]
[devel11:80858] mca:rmaps: setting mapping policies for job [2965,1]
nprocs 48
[devel11:80858] mca:rmaps[169] mapping not set by user - using bynuma
[devel11:80858] mca:rmaps:ppr: job [2965,1] not using ppr mapper PPR
NULL policy PPR NOTSET
[devel11:80858] [[2965,0],0] rmaps:seq called on job [2965,1]
[devel11:80858] mca:rmaps:seq: job [2965,1] not using seq mapper
[devel11:80858] mca:rmaps:resilient: cannot perform initial map of job
[2965,1] - no fault groups
[devel11:80858] mca:rmaps:mindist: job [2965,1] not using mindist mapper
[devel11:80858] mca:rmaps:rr: mapping job [2965,1]
[devel11:80858] [[2965,0],0] Starting with 2 nodes in list
[devel11:80858] [[2965,0],0] Filtering thru apps
[devel11:80858] [[2965,0],0] Retained 2 nodes in list
[devel11:80858] [[2965,0],0] node miriel025 has 24 slots available
[devel11:80858] [[2965,0],0] node miriel026 has 24 slots available
[devel11:80858] AVAILABLE NODES FOR MAPPING:
[devel11:80858]     node: miriel025 daemon: 1
[devel11:80858]     node: miriel026 daemon: 2
[devel11:80858] [[2965,0],0] Starting bookmark at node miriel025
[devel11:80858] [[2965,0],0] Starting at node miriel025
[devel11:80858] mca:rmaps:rr: mapping no-span by NUMANode for job
[2965,1] slots 48 num_procs 48
[devel11:80858] mca:rmaps:rr: found 4 NUMANode objects on node miriel025
[devel11:80858] mca:rmaps:rr: calculated nprocs 24
[devel11:80858] mca:rmaps:rr: assigning nprocs 24
[devel11:80858] mca:rmaps:rr: found 4 NUMANode objects on node miriel026
[devel11:80858] mca:rmaps:rr: calculated nprocs 24
[devel11:80858] mca:rmaps:rr: assigning nprocs 24
[devel11:80858] mca:rmaps:base: computing vpids by slot for job [2965,1]
[devel11:80858] mca:rmaps:base: assigning rank 0 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 1 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 2 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 3 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 4 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 5 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 6 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 7 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 8 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 9 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 10 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 11 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 12 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 13 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 14 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 15 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 16 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 17 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 18 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 19 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 20 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 21 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 22 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 23 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 24 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 25 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 26 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 27 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 28 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 29 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 30 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 31 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 32 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 33 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 34 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 35 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 36 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 37 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 38 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 39 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 40 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 41 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 42 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 43 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 44 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 45 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 46 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 47 to node miriel026
[devel11:80858] [[2965,0],0] rmaps:base:compute_usage
[devel11:80858] mca:rmaps: compute bindings for job [2965,1] with policy
CORE[4008]
[devel11:80858] [[2965,0],0] bind_depth: 6 map_depth 2
[devel11:80858] mca:rmaps: bind downward for job [2965,1] with bindings CORE
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],0] BITMAP 0
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],0][miriel025] TO
socket 0[core 0[hwt 0]]: [B/././././././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],1] BITMAP 12
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],1][miriel025] TO
socket 0[core 6[hwt 0]]: [././././././B/././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],2] BITMAP 1
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],2][miriel025] TO
socket 1[core 12[hwt 0]]: [./././././././././././.][B/././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],3] BITMAP 13
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],3][miriel025] TO
socket 1[core 18[hwt 0]]: [./././././././././././.][././././././B/././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],4] BITMAP 2
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],4][miriel025] TO
socket 0[core 1[hwt 0]]: [./B/./././././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],5] BITMAP 14
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],5][miriel025] TO
socket 0[core 7[hwt 0]]: [./././././././B/./././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],6] BITMAP 3
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],6][miriel025] TO
socket 1[core 13[hwt 0]]: [./././././././././././.][./B/./././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],7] BITMAP 15
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],7][miriel025] TO
socket 1[core 19[hwt 0]]: [./././././././././././.][./././././././B/./././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],8] BITMAP 4
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],8][miriel025] TO
socket 0[core 2[hwt 0]]: [././B/././././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],9] BITMAP 16
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],9][miriel025] TO
socket 0[core 8[hwt 0]]: [././././././././B/././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],10] BITMAP 5
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],10][miriel025] TO
socket 1[core 14[hwt 0]]: [./././././././././././.][././B/././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],11] BITMAP 17
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],11][miriel025] TO
socket 1[core 20[hwt 0]]: [./././././././././././.][././././././././B/././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],12] BITMAP 6
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],12][miriel025] TO
socket 0[core 3[hwt 0]]: [./././B/./././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],13] BITMAP 18
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],13][miriel025] TO
socket 0[core 9[hwt 0]]: [./././././././././B/./.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],14] BITMAP 7
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],14][miriel025] TO
socket 1[core 15[hwt 0]]: [./././././././././././.][./././B/./././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],15] BITMAP 19
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],15][miriel025] TO
socket 1[core 21[hwt 0]]: [./././././././././././.][./././././././././B/./.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],16] BITMAP 8
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],16][miriel025] TO
socket 0[core 4[hwt 0]]: [././././B/././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],17] BITMAP 20
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],17][miriel025] TO
socket 0[core 10[hwt 0]]: [././././././././././B/.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],18] BITMAP 9
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],18][miriel025] TO
socket 1[core 16[hwt 0]]: [./././././././././././.][././././B/././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],19] BITMAP 21
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],19][miriel025] TO
socket 1[core 22[hwt 0]]: [./././././././././././.][././././././././././B/.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],20] BITMAP 10
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],20][miriel025] TO
socket 0[core 5[hwt 0]]: [./././././B/./././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],21] BITMAP 22
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],21][miriel025] TO
socket 0[core 11[hwt 0]]: [./././././././././././B][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],22] BITMAP 11
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],22][miriel025] TO
socket 1[core 17[hwt 0]]: [./././././././././././.][./././././B/./././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],23] BITMAP 23
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],23][miriel025] TO
socket 1[core 23[hwt 0]]: [./././././././././././.][./././././././././././B]
[miriel025:60980] MCW rank 11 not bound
[miriel025:60990] MCW rank 21 not bound
[miriel025:60981] MCW rank 12 not bound
[miriel025:60979] MCW rank 10 not bound
[miriel025:60977] MCW rank 8 not bound
[miriel025:60970] MCW rank 1 not bound
[miriel025:60972] MCW rank 3 not bound
[miriel025:60984] MCW rank 15 not bound
[miriel026:163985] MCW rank 34 not bound
[miriel026:163993] MCW rank 42 not bound
[miriel026:163981] MCW rank 30 not bound
[miriel026:163983] MCW rank 32 not bound
[miriel025:60975] MCW rank 6 not bound
[miriel025:60986] MCW rank 17 not bound
[miriel025:60992] MCW rank 23 not bound
[miriel025:60973] MCW rank 4 not bound
[miriel025:60978] MCW rank 9 not bound
[miriel025:60969] MCW rank 0 not bound
[miriel025:60991] MCW rank 22 not bound
[miriel025:60974] MCW rank 5 not bound
[miriel025:60982] MCW rank 13 not bound
[miriel025:60989] MCW rank 20 not bound
[miriel025:60988] MCW rank 19 not bound
[miriel025:60983] MCW rank 14 not bound
[miriel025:60987] MCW rank 18 not bound
[miriel025:60976] MCW rank 7 not bound
[miriel026:163996] MCW rank 45 not bound
[miriel026:163979] MCW rank 28 not bound
[miriel026:163990] MCW rank 39 not bound
[miriel026:163976] MCW rank 25 not bound
[miriel026:163997] MCW rank 46 not bound
[miriel025:60971] MCW rank 2 not bound
[miriel026:163995] MCW rank 44 not bound
[miriel026:163987] MCW rank 36 not bound
[miriel026:163982] MCW rank 31 not bound
[miriel025:60985] MCW rank 16 not bound
[miriel026:163980] MCW rank 29 not bound
[miriel026:163975] MCW rank 24 not bound
[miriel026:163978] MCW rank 27 not bound
[miriel026:163992] MCW rank 41 not bound
[miriel026:163991] MCW rank 40 not bound
[miriel026:163998] MCW rank 47 not bound
[miriel026:163986] MCW rank 35 not bound
[miriel026:163984] MCW rank 33 not bound
[miriel026:163989] MCW rank 38 not bound
[miriel026:163994] MCW rank 43 not bound
[miriel026:163988] MCW rank 37 not bound
[miriel026:163977] MCW rank 26 not bound
##############################################################################

On 13/04/2017 at 16:31, r...@open-mpi.org wrote:
Try adding "-mca rmaps_base_verbose 5" and see what that output tells us - I
assume you have a debug build configured, yes (i.e., added --enable-debug to
the configure line)?
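
To read the verbose output quickly, it can help to cross-check the bindings the mapper computed ("BOUND PROC ... TO" lines) against what each process reported ("MCW rank N not bound"). The sketch below is an illustration only, assuming the log formats shown in this thread; the helper name planned_but_unbound is hypothetical and not part of Open MPI.

```python
import re

# "BOUND PROC [[jobfam,job],rank][node] TO" is printed by the mapper.
PLANNED = re.compile(r"BOUND PROC \[\[\d+,\d+\],(\d+)\]\[([^\]]+)\] TO")
# "MCW rank N not bound" is printed by --report-bindings for each process.
UNBOUND = re.compile(r"MCW rank (\d+) not bound")

def planned_but_unbound(log_text):
    """List (rank, node) pairs the mapper bound but that reported 'not bound'."""
    planned = {int(rank): node for rank, node in PLANNED.findall(log_text)}
    unbound = {int(rank) for rank in UNBOUND.findall(log_text)}
    return sorted((rank, node) for rank, node in planned.items() if rank in unbound)

sample = """\
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],0][miriel025] TO
socket 0[core 0[hwt 0]]: [B/././././././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],1][miriel025] TO
socket 0[core 6[hwt 0]]: [././././././B/././././.][./././././././././././.]
[miriel025:60969] MCW rank 0 not bound
[miriel025:60970] MCW rank 1 not bound
"""

for rank, node in planned_but_unbound(sample):
    print(f"rank {rank} on {node}: binding computed but not applied")
```

A non-empty result means the mapper chose a binding that the launched process did not end up with.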


On Apr 13, 2017, at 7:28 AM, Cyril Bordage <cyril.bord...@inria.fr> wrote:

When I run this command from a compute node, I also get that output, but not
when I run it from a login node (with the same machine file).


Cyril.

On 13/04/2017 at 16:22, r...@open-mpi.org wrote:
We are asking all these questions because we cannot replicate your problem - so 
we are trying to help you figure out what is different or missing from your 
machine. When I run your cmd line on my system, I get:

[rhc002.cluster:55965] MCW rank 24 bound to socket 0[core 0[hwt 0-1]]: 
[BB/../../../../../../../../../../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 25 bound to socket 1[core 12[hwt 0-1]]: 
[../../../../../../../../../../../..][BB/../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 26 bound to socket 0[core 1[hwt 0-1]]: 
[../BB/../../../../../../../../../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 27 bound to socket 1[core 13[hwt 0-1]]: 
[../../../../../../../../../../../..][../BB/../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 28 bound to socket 0[core 2[hwt 0-1]]: 
[../../BB/../../../../../../../../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 29 bound to socket 1[core 14[hwt 0-1]]: 
[../../../../../../../../../../../..][../../BB/../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 30 bound to socket 0[core 3[hwt 0-1]]: 
[../../../BB/../../../../../../../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 31 bound to socket 1[core 15[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../BB/../../../../../../../..]
[rhc002.cluster:55965] MCW rank 32 bound to socket 0[core 4[hwt 0-1]]: 
[../../../../BB/../../../../../../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 33 bound to socket 1[core 16[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../BB/../../../../../../..]
[rhc002.cluster:55965] MCW rank 34 bound to socket 0[core 5[hwt 0-1]]: 
[../../../../../BB/../../../../../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 35 bound to socket 1[core 17[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../BB/../../../../../..]
[rhc002.cluster:55965] MCW rank 36 bound to socket 0[core 6[hwt 0-1]]: 
[../../../../../../BB/../../../../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 37 bound to socket 1[core 18[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../BB/../../../../..]
[rhc002.cluster:55965] MCW rank 38 bound to socket 0[core 7[hwt 0-1]]: 
[../../../../../../../BB/../../../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 39 bound to socket 1[core 19[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../BB/../../../..]
[rhc002.cluster:55965] MCW rank 40 bound to socket 0[core 8[hwt 0-1]]: 
[../../../../../../../../BB/../../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 41 bound to socket 1[core 20[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../../BB/../../..]
[rhc002.cluster:55965] MCW rank 42 bound to socket 0[core 9[hwt 0-1]]: 
[../../../../../../../../../BB/../..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 43 bound to socket 1[core 21[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../../../BB/../..]
[rhc002.cluster:55965] MCW rank 44 bound to socket 0[core 10[hwt 0-1]]: 
[../../../../../../../../../../BB/..][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 45 bound to socket 1[core 22[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../../../../BB/..]
[rhc002.cluster:55965] MCW rank 46 bound to socket 0[core 11[hwt 0-1]]: 
[../../../../../../../../../../../BB][../../../../../../../../../../../..]
[rhc002.cluster:55965] MCW rank 47 bound to socket 1[core 23[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../../../../../BB]
[rhc001:197743] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: 
[BB/../../../../../../../../../../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 1 bound to socket 1[core 12[hwt 0-1]]: 
[../../../../../../../../../../../..][BB/../../../../../../../../../../..]
[rhc001:197743] MCW rank 2 bound to socket 0[core 1[hwt 0-1]]: 
[../BB/../../../../../../../../../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 3 bound to socket 1[core 13[hwt 0-1]]: 
[../../../../../../../../../../../..][../BB/../../../../../../../../../..]
[rhc001:197743] MCW rank 4 bound to socket 0[core 2[hwt 0-1]]: 
[../../BB/../../../../../../../../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 5 bound to socket 1[core 14[hwt 0-1]]: 
[../../../../../../../../../../../..][../../BB/../../../../../../../../..]
[rhc001:197743] MCW rank 6 bound to socket 0[core 3[hwt 0-1]]: 
[../../../BB/../../../../../../../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 7 bound to socket 1[core 15[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../BB/../../../../../../../..]
[rhc001:197743] MCW rank 8 bound to socket 0[core 4[hwt 0-1]]: 
[../../../../BB/../../../../../../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 9 bound to socket 1[core 16[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../BB/../../../../../../..]
[rhc001:197743] MCW rank 10 bound to socket 0[core 5[hwt 0-1]]: 
[../../../../../BB/../../../../../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 11 bound to socket 1[core 17[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../BB/../../../../../..]
[rhc001:197743] MCW rank 12 bound to socket 0[core 6[hwt 0-1]]: 
[../../../../../../BB/../../../../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 13 bound to socket 1[core 18[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../BB/../../../../..]
[rhc001:197743] MCW rank 14 bound to socket 0[core 7[hwt 0-1]]: 
[../../../../../../../BB/../../../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 15 bound to socket 1[core 19[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../BB/../../../..]
[rhc001:197743] MCW rank 16 bound to socket 0[core 8[hwt 0-1]]: 
[../../../../../../../../BB/../../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 17 bound to socket 1[core 20[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../../BB/../../..]
[rhc001:197743] MCW rank 18 bound to socket 0[core 9[hwt 0-1]]: 
[../../../../../../../../../BB/../..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 19 bound to socket 1[core 21[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../../../BB/../..]
[rhc001:197743] MCW rank 20 bound to socket 0[core 10[hwt 0-1]]: 
[../../../../../../../../../../BB/..][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 21 bound to socket 1[core 22[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../../../../BB/..]
[rhc001:197743] MCW rank 22 bound to socket 0[core 11[hwt 0-1]]: 
[../../../../../../../../../../../BB][../../../../../../../../../../../..]
[rhc001:197743] MCW rank 23 bound to socket 1[core 23[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../../../../../../BB]

Exactly as expected. You might check that you have libnuma and libnuma-devel
installed.
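
For anyone who would rather not read a long --report-bindings log by eye, here is a minimal Python sketch that groups ranks per host into bound and unbound. It is not part of Open MPI: the function name summarize_bindings and the regular expression are assumptions based only on the log lines quoted in this thread ("MCW rank N bound to ..." and "MCW rank N not bound"), so other Open MPI versions may need a different pattern.

```python
import re
from collections import defaultdict

# Matches the header of report lines as they appear in this thread, e.g.
#   [rhc001:197743] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]:
#   [miriel010:160662] MCW rank 0 not bound
# The pattern is an assumption derived from those lines only.
RANK_RE = re.compile(r"\[([^\]:]+):\d+\] MCW rank (\d+) (not bound|bound to)")


def summarize_bindings(log_text):
    """Return {host: {"bound": [ranks], "unbound": [ranks]}} (hypothetical helper)."""
    summary = defaultdict(lambda: {"bound": [], "unbound": []})
    for host, rank, state in RANK_RE.findall(log_text):
        key = "unbound" if state == "not bound" else "bound"
        summary[host][key].append(int(rank))
    return dict(summary)


if __name__ == "__main__":
    sample = (
        "[rhc001:197743] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: \n"
        "[BB/../..][../../..]\n"
        "[miriel010:160662] MCW rank 0 not bound\n"
    )
    for host, ranks in summarize_bindings(sample).items():
        print(host, "bound:", ranks["bound"], "unbound:", ranks["unbound"])
```

Piping the stderr of mpirun into a file and passing its contents to summarize_bindings should make a "not bound" regression like the one reported below show up as a non-empty "unbound" list.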


On Apr 13, 2017, at 6:50 AM, gil...@rist.or.jp wrote:

OK thanks,

We've had some issues in the past when Open MPI assumed that the (login)
node running mpirun has the same topology as the other (compute) nodes.
I just wanted to rule out this scenario.

Cheers,

Gilles

----- Original Message -----
I am using the 6886c12 commit.
I have no particular option for the configuration.
I launch my application in the same way as I presented in my first
email. Here is the exact line:
mpirun -np 48 -machinefile mf -bind-to core -report-bindings ./a.out

lstopo does give the same output on both types of nodes. What is the
purpose of that?

Thanks.


Cyril.

Le 13/04/2017 à 15:24, gil...@rist.or.jp a écrit :
Also, can you please run
lstopo
on both your login and compute nodes ?

Cheers,

Gilles


----- Original Message -----
Can you be a bit more specific?

- What version of Open MPI are you using?
- How did you configure Open MPI?
- How are you launching Open MPI applications?


On Apr 13, 2017, at 9:08 AM, Cyril Bordage <cyril.bord...@inria.fr> wrote:
Hi,

Now this bug also happens when I launch my mpirun command from the
compute node.


Cyril.

Le 06/04/2017 à 05:38, r...@open-mpi.org a écrit :
I believe this has been fixed now - please let me know

On Mar 30, 2017, at 1:57 AM, Cyril Bordage <cyril.bordage@inria.fr> wrote:
Hello,

I am using the git version of MPI with "-bind-to core -report-bindings"
and I get this for all processes:
[miriel010:160662] MCW rank 0 not bound


When I use an old version I get:
[miriel010:44921] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././././././././.][./././././././././././.]

According to git bisect, the culprit seems to be: 48fc339

This bug happens only when I launch my mpirun command from a login node,
not from a compute node.


Cyril.
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel

--
Jeff Squyres
jsquy...@cisco.com
