Ah, wait - I had missed your -bind-to core directive. With that, it does indeed behave poorly, so I can now replicate the problem.
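
In case it helps anyone compare runs, here is a minimal sketch (not part of Open MPI) that scans captured --report-bindings output and lists the ranks reported as "not bound". The two regular expressions are assumptions based only on the line shapes quoted in this thread, and summarize_bindings and SAMPLE are names made up for illustration.

```python
import re

# Line shapes copied from the --report-bindings output quoted in this thread.
# Assumption: other Open MPI versions may format these lines differently.
BOUND = re.compile(r"\[([^:\]\s]+):\d+\] MCW rank (\d+) bound to ")
NOT_BOUND = re.compile(r"\[([^:\]\s]+):\d+\] MCW rank (\d+) not bound")


def summarize_bindings(text):
    """Return (bound, unbound): sorted lists of (host, rank) tuples."""
    bound = sorted({(host, int(rank)) for host, rank in BOUND.findall(text)})
    unbound = sorted({(host, int(rank)) for host, rank in NOT_BOUND.findall(text)})
    return bound, unbound


# Four lines taken from the report below: two from n1, two from n2.
SAMPLE = """\
[n1:29794] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
[n1:29794] MCW rank 1 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.]
[n2:52851] MCW rank 0 not bound
[n2:52852] MCW rank 1 not bound
"""

if __name__ == "__main__":
    bound, unbound = summarize_bindings(SAMPLE)
    print(f"{len(bound)} bound, {len(unbound)} not bound")
    for host, rank in unbound:
        print(f"rank {rank} on {host} is not bound")
```

To check a real run, replace SAMPLE with the saved mpirun output. An empty "not bound" list only means no such line was printed; as the non-MPI case below shows, a run can also print no binding report at all.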
> On Apr 14, 2017, at 2:21 AM, r...@open-mpi.org wrote: > > Sorry, but both of your non-working examples work fine for me: > > $ mpirun -n 16 -host rhc002:16 --report-bindings /bin/true > [rhc002.cluster:63444] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] > [rhc002.cluster:63444] MCW rank 1 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63444] MCW rank 2 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] 
> [rhc002.cluster:63444] MCW rank 3 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63444] MCW rank 4 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] > [rhc002.cluster:63444] MCW rank 5 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63444] MCW rank 6 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] 
> [rhc002.cluster:63444] MCW rank 7 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63444] MCW rank 8 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] > [rhc002.cluster:63444] MCW rank 9 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63444] MCW rank 10 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] 
> [rhc002.cluster:63444] MCW rank 11 bound to socket 1[core 12[hwt 0-1]], > socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt > 0-1]], socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core > 18[hwt 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket > 1[core 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63444] MCW rank 12 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] > [rhc002.cluster:63444] MCW rank 13 bound to socket 1[core 12[hwt 0-1]], > socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt > 0-1]], socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core > 18[hwt 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket > 1[core 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63444] MCW rank 14 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] 
> [rhc002.cluster:63444] MCW rank 15 bound to socket 1[core 12[hwt 0-1]], > socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt > 0-1]], socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core > 18[hwt 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket > 1[core 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > > > $ mpirun -n 16 -host rhc002:16 --report-bindings ./hello > [rhc002.cluster:63525] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] > [rhc002.cluster:63525] MCW rank 1 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63525] MCW rank 2 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] 
> [rhc002.cluster:63525] MCW rank 3 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63525] MCW rank 4 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] > [rhc002.cluster:63525] MCW rank 5 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63525] MCW rank 6 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] 
> [rhc002.cluster:63525] MCW rank 7 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63525] MCW rank 8 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] > [rhc002.cluster:63525] MCW rank 9 bound to socket 1[core 12[hwt 0-1]], socket > 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], > socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt > 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket 1[core > 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63525] MCW rank 10 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] 
> [rhc002.cluster:63525] MCW rank 11 bound to socket 1[core 12[hwt 0-1]], > socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt > 0-1]], socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core > 18[hwt 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket > 1[core 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63525] MCW rank 12 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] > [rhc002.cluster:63525] MCW rank 13 bound to socket 1[core 12[hwt 0-1]], > socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt > 0-1]], socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core > 18[hwt 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket > 1[core 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > [rhc002.cluster:63525] MCW rank 14 bound to socket 0[core 0[hwt 0-1]], socket > 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], > socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt > 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core > 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: > [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../../../..] 
> [rhc002.cluster:63525] MCW rank 15 bound to socket 1[core 12[hwt 0-1]], > socket 1[core 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt > 0-1]], socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core > 18[hwt 0-1]], socket 1[core 19[hwt 0-1]], socket 1[core 20[hwt 0-1]], socket > 1[core 21[hwt 0-1]], socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: > [../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB/BB] > Hello, World, I am 1 of 16 [15 local peers]: get_cpubind: 0 bitmap 12-23,36-47 > Hello, World, I am 5 of 16 [15 local peers]: get_cpubind: 0 bitmap 12-23,36-47 > Hello, World, I am 11 of 16 [15 local peers]: get_cpubind: 0 bitmap > 12-23,36-47 > Hello, World, I am 15 of 16 [15 local peers]: get_cpubind: 0 bitmap > 12-23,36-47 > Hello, World, I am 3 of 16 [15 local peers]: get_cpubind: 0 bitmap 12-23,36-47 > Hello, World, I am 9 of 16 [15 local peers]: get_cpubind: 0 bitmap 12-23,36-47 > Hello, World, I am 13 of 16 [15 local peers]: get_cpubind: 0 bitmap > 12-23,36-47 > Hello, World, I am 0 of 16 [15 local peers]: get_cpubind: 0 bitmap 0-11,24-35 > Hello, World, I am 2 of 16 [15 local peers]: get_cpubind: 0 bitmap 0-11,24-35 > Hello, World, I am 4 of 16 [15 local peers]: get_cpubind: 0 bitmap 0-11,24-35 > Hello, World, I am 6 of 16 [15 local peers]: get_cpubind: 0 bitmap 0-11,24-35 > Hello, World, I am 7 of 16 [15 local peers]: get_cpubind: 0 bitmap 12-23,36-47 > Hello, World, I am 8 of 16 [15 local peers]: get_cpubind: 0 bitmap 0-11,24-35 > Hello, World, I am 10 of 16 [15 local peers]: get_cpubind: 0 bitmap 0-11,24-35 > Hello, World, I am 12 of 16 [15 local peers]: get_cpubind: 0 bitmap 0-11,24-35 > Hello, World, I am 14 of 16 [15 local peers]: get_cpubind: 0 bitmap 0-11,24-35 > > Can you dig deeper and see why, with current master, you are seeing something > different? Mine is a debug build, if that makes a difference. 
>
> Ralph
>
>
>> On Apr 13, 2017, at 10:24 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>
>> Ralph,
>>
>> I can easily reproduce the issue with two nodes and the latest master.
>>
>> All commands are run on n1, which has the same topology (2 sockets * 8 cores
>> each) as n2.
>>
>> 1) everything works
>>
>> $ mpirun -np 16 -bind-to core --report-bindings true
>> [n1:29794] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
>> [n1:29794] MCW rank 1 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.]
>> [n1:29794] MCW rank 2 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
>> [n1:29794] MCW rank 3 bound to socket 1[core 9[hwt 0]]: [./././././././.][./B/./././././.]
>> [n1:29794] MCW rank 4 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
>> [n1:29794] MCW rank 5 bound to socket 1[core 10[hwt 0]]: [./././././././.][././B/././././.]
>> [n1:29794] MCW rank 6 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
>> [n1:29794] MCW rank 7 bound to socket 1[core 11[hwt 0]]: [./././././././.][./././B/./././.]
>> [n1:29794] MCW rank 8 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
>> [n1:29794] MCW rank 9 bound to socket 1[core 12[hwt 0]]: [./././././././.][././././B/././.]
>> [n1:29794] MCW rank 10 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
>> [n1:29794] MCW rank 11 bound to socket 1[core 13[hwt 0]]: [./././././././.][./././././B/./.]
>> [n1:29794] MCW rank 12 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
>> [n1:29794] MCW rank 13 bound to socket 1[core 14[hwt 0]]: [./././././././.][././././././B/.]
>> [n1:29794] MCW rank 14 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
>> [n1:29794] MCW rank 15 bound to socket 1[core 15[hwt 0]]: >> [./././././././.][./././././././B] >> >> $ mpirun -np 16 -bind-to core --host n1:16 --report-bindings true >> [n1:29850] MCW rank 0 bound to socket 0[core 0[hwt 0]]: >> [B/././././././.][./././././././.] >> [n1:29850] MCW rank 1 bound to socket 1[core 8[hwt 0]]: >> [./././././././.][B/././././././.] >> [n1:29850] MCW rank 2 bound to socket 0[core 1[hwt 0]]: >> [./B/./././././.][./././././././.] >> [n1:29850] MCW rank 3 bound to socket 1[core 9[hwt 0]]: >> [./././././././.][./B/./././././.] >> [n1:29850] MCW rank 4 bound to socket 0[core 2[hwt 0]]: >> [././B/././././.][./././././././.] >> [n1:29850] MCW rank 5 bound to socket 1[core 10[hwt 0]]: >> [./././././././.][././B/././././.] >> [n1:29850] MCW rank 6 bound to socket 0[core 3[hwt 0]]: >> [./././B/./././.][./././././././.] >> [n1:29850] MCW rank 7 bound to socket 1[core 11[hwt 0]]: >> [./././././././.][./././B/./././.] >> [n1:29850] MCW rank 8 bound to socket 0[core 4[hwt 0]]: >> [././././B/././.][./././././././.] >> [n1:29850] MCW rank 9 bound to socket 1[core 12[hwt 0]]: >> [./././././././.][././././B/././.] >> [n1:29850] MCW rank 10 bound to socket 0[core 5[hwt 0]]: >> [./././././B/./.][./././././././.] >> [n1:29850] MCW rank 11 bound to socket 1[core 13[hwt 0]]: >> [./././././././.][./././././B/./.] >> [n1:29850] MCW rank 12 bound to socket 0[core 6[hwt 0]]: >> [././././././B/.][./././././././.] >> [n1:29850] MCW rank 13 bound to socket 1[core 14[hwt 0]]: >> [./././././././.][././././././B/.] >> [n1:29850] MCW rank 14 bound to socket 0[core 7[hwt 0]]: >> [./././././././B][./././././././.] 
>> [n1:29850] MCW rank 15 bound to socket 1[core 15[hwt 0]]: [./././././././.][./././././././B]
>>
>> 2) with another node
>>
>> $ mpirun -np 16 -bind-to core --host n2:16 --report-bindings true
>>
>> /* no output with a non-MPI app! */
>>
>> $ mpirun -np 16 -bind-to core --host n2:16 --report-bindings ./hello_c
>> [n2:52851] MCW rank 0 not bound
>> [n2:52852] MCW rank 1 not bound
>> [n2:52853] MCW rank 2 not bound
>> [n2:52854] MCW rank 3 not bound
>> [n2:52855] MCW rank 4 not bound
>> [n2:52856] MCW rank 5 not bound
>> [n2:52857] MCW rank 6 not bound
>> [n2:52859] MCW rank 7 not bound
>> [n2:52861] MCW rank 8 not bound
>> [n2:52864] MCW rank 9 not bound
>> [n2:52866] MCW rank 10 not bound
>> [n2:52869] MCW rank 11 not bound
>> [n2:52877] MCW rank 15 not bound
>> [n2:52871] MCW rank 12 not bound
>> [n2:52873] MCW rank 13 not bound
>> [n2:52876] MCW rank 14 not bound
>> Hello, world, I am 0 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 1 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 2 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 3 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 4 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 5 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 6 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 7 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 8 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 9 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 10 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 11 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 12 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 13 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 14 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>> Hello, world, I am 15 of 16, (Open MPI v4.0.0a1, package: Open MPI gilles@n Distribution, ident: 4.0.0a1, repo rev: v2.x-dev-4028-g3202de8, Unreleased developer copy, 165)
>>
>> /* bindings are reported with an MPI app, but no binding has been performed */
>>
>> 3) workaround: use -map-by core (works even with a non-MPI app)
>>
>> $ mpirun -np 16 -bind-to core -map-by core --host n2:16 --report-bindings true
>> [n2:52982] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
>> [n2:52982] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
>> [n2:52982] MCW rank 2 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
>> [n2:52982] MCW rank 3 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
>> [n2:52982] MCW rank 4 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
>> [n2:52982] MCW rank 5 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
>> [n2:52982] MCW rank 6 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
>> [n2:52982] MCW rank 7 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
>> [n2:52982] MCW rank 8 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.]
>> [n2:52982] MCW rank 9 bound to socket 1[core 9[hwt 0]]: [./././././././.][./B/./././././.]
>> [n2:52982] MCW rank 10 bound to socket 1[core 10[hwt 0]]: [./././././././.][././B/././././.]
>> [n2:52982] MCW rank 11 bound to socket 1[core 11[hwt 0]]: [./././././././.][./././B/./././.]
>> [n2:52982] MCW rank 12 bound to socket 1[core 12[hwt 0]]: [./././././././.][././././B/././.]
>> [n2:52982] MCW rank 13 bound to socket 1[core 13[hwt 0]]: [./././././././.][./././././B/./.]
>> [n2:52982] MCW rank 14 bound to socket 1[core 14[hwt 0]]: [./././././././.][././././././B/.]
>> [n2:52982] MCW rank 15 bound to socket 1[core 15[hwt 0]]: [./././././././.][./././././././B]
>>
>> Note that if both nodes are used, binding is just fine:
>>
>> $ mpirun -np 32 -bind-to core --host n1:16,n2:16 --report-bindings true
>> [n1:30008] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
>> [n1:30008] MCW rank 1 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.]
>> [n1:30008] MCW rank 2 bound to socket 0[core 1[hwt 0]]: >> [./B/./././././.][./././././././.] >> [n1:30008] MCW rank 3 bound to socket 1[core 9[hwt 0]]: >> [./././././././.][./B/./././././.] >> [n1:30008] MCW rank 4 bound to socket 0[core 2[hwt 0]]: >> [././B/././././.][./././././././.] >> [n1:30008] MCW rank 5 bound to socket 1[core 10[hwt 0]]: >> [./././././././.][././B/././././.] >> [n1:30008] MCW rank 6 bound to socket 0[core 3[hwt 0]]: >> [./././B/./././.][./././././././.] >> [n1:30008] MCW rank 7 bound to socket 1[core 11[hwt 0]]: >> [./././././././.][./././B/./././.] >> [n1:30008] MCW rank 8 bound to socket 0[core 4[hwt 0]]: >> [././././B/././.][./././././././.] >> [n1:30008] MCW rank 9 bound to socket 1[core 12[hwt 0]]: >> [./././././././.][././././B/././.] >> [n1:30008] MCW rank 10 bound to socket 0[core 5[hwt 0]]: >> [./././././B/./.][./././././././.] >> [n1:30008] MCW rank 11 bound to socket 1[core 13[hwt 0]]: >> [./././././././.][./././././B/./.] >> [n1:30008] MCW rank 12 bound to socket 0[core 6[hwt 0]]: >> [././././././B/.][./././././././.] >> [n1:30008] MCW rank 13 bound to socket 1[core 14[hwt 0]]: >> [./././././././.][././././././B/.] >> [n1:30008] MCW rank 14 bound to socket 0[core 7[hwt 0]]: >> [./././././././B][./././././././.] >> [n1:30008] MCW rank 15 bound to socket 1[core 15[hwt 0]]: >> [./././././././.][./././././././B] >> [n2:53187] MCW rank 16 bound to socket 0[core 0[hwt 0]]: >> [B/././././././.][./././././././.] >> [n2:53187] MCW rank 17 bound to socket 1[core 8[hwt 0]]: >> [./././././././.][B/././././././.] >> [n2:53187] MCW rank 18 bound to socket 0[core 1[hwt 0]]: >> [./B/./././././.][./././././././.] >> [n2:53187] MCW rank 19 bound to socket 1[core 9[hwt 0]]: >> [./././././././.][./B/./././././.] >> [n2:53187] MCW rank 20 bound to socket 0[core 2[hwt 0]]: >> [././B/././././.][./././././././.] >> [n2:53187] MCW rank 21 bound to socket 1[core 10[hwt 0]]: >> [./././././././.][././B/././././.] 
>> [n2:53187] MCW rank 22 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
>> [n2:53187] MCW rank 23 bound to socket 1[core 11[hwt 0]]: [./././././././.][./././B/./././.]
>> [n2:53187] MCW rank 24 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
>> [n2:53187] MCW rank 25 bound to socket 1[core 12[hwt 0]]: [./././././././.][././././B/././.]
>> [n2:53187] MCW rank 26 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
>> [n2:53187] MCW rank 27 bound to socket 1[core 13[hwt 0]]: [./././././././.][./././././B/./.]
>> [n2:53187] MCW rank 28 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
>> [n2:53187] MCW rank 29 bound to socket 1[core 14[hwt 0]]: [./././././././.][././././././B/.]
>> [n2:53187] MCW rank 30 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
>> [n2:53187] MCW rank 31 bound to socket 1[core 15[hwt 0]]: [./././././././.][./././././././B]
>>
>> Cheers,
>>
>> Gilles
>>
>> On 4/14/2017 12:43 AM, r...@open-mpi.org wrote:
>>> All right, let’s replace rmaps_base_verbose with odls_base_verbose and see
>>> what that says
>>>
>>>> On Apr 13, 2017, at 8:34 AM, Cyril Bordage <cyril.bord...@inria.fr> wrote:
>>>>
>>>> '-report-bindings' does that.
>>>> I used this option because the ranks did not seem to be bound (if I use
>>>> a rank file the performance is far better).
>>>>
>>>> On 13/04/2017 at 17:24, r...@open-mpi.org wrote:
>>>>> Okay, so as far as OMPI is concerned, it correctly bound everyone! So how
>>>>> are you generating this output claiming it isn’t bound?
>>>>> >>>>>> On Apr 13, 2017, at 7:57 AM, Cyril Bordage <cyril.bord...@inria.fr> >>>>>> wrote: >>>>>> >>>>>> devel11:17550] [[29888,0],0] rmaps:base set policy with NULL device >>>>>> NONNULL >>>>>> [devel11:17550] mca:rmaps:select: checking available component mindist >>>>>> [devel11:17550] mca:rmaps:select: Querying component [mindist] >>>>>> [devel11:17550] mca:rmaps:select: checking available component ppr >>>>>> [devel11:17550] mca:rmaps:select: Querying component [ppr] >>>>>> [devel11:17550] mca:rmaps:select: checking available component rank_file >>>>>> [devel11:17550] mca:rmaps:select: Querying component [rank_file] >>>>>> [devel11:17550] mca:rmaps:select: checking available component resilient >>>>>> [devel11:17550] mca:rmaps:select: Querying component [resilient] >>>>>> [devel11:17550] mca:rmaps:select: checking available component >>>>>> round_robin >>>>>> [devel11:17550] mca:rmaps:select: Querying component [round_robin] >>>>>> [devel11:17550] mca:rmaps:select: checking available component seq >>>>>> [devel11:17550] mca:rmaps:select: Querying component [seq] >>>>>> [devel11:17550] [[29888,0],0]: Final mapper priorities >>>>>> [devel11:17550] Mapper: ppr Priority: 90 >>>>>> [devel11:17550] Mapper: seq Priority: 60 >>>>>> [devel11:17550] Mapper: resilient Priority: 40 >>>>>> [devel11:17550] Mapper: mindist Priority: 20 >>>>>> [devel11:17550] Mapper: round_robin Priority: 10 >>>>>> [devel11:17550] Mapper: rank_file Priority: 0 >>>>>> [miriel025:62329] [[29888,0],1] rmaps:base set policy with NULL device >>>>>> NONNULL >>>>>> [miriel025:62329] mca:rmaps:select: checking available component mindist >>>>>> [miriel025:62329] mca:rmaps:select: Querying component [mindist] >>>>>> [miriel025:62329] mca:rmaps:select: checking available component ppr >>>>>> [miriel025:62329] mca:rmaps:select: Querying component [ppr] >>>>>> [miriel025:62329] mca:rmaps:select: checking available component >>>>>> rank_file >>>>>> [miriel025:62329] mca:rmaps:select: Querying component 
[rank_file] >>>>>> [miriel026:165125] [[29888,0],2] rmaps:base set policy with NULL device >>>>>> NONNULL >>>>>> [miriel026:165125] mca:rmaps:select: checking available component mindist >>>>>> [miriel026:165125] mca:rmaps:select: Querying component [mindist] >>>>>> [miriel026:165125] mca:rmaps:select: checking available component ppr >>>>>> [miriel026:165125] mca:rmaps:select: Querying component [ppr] >>>>>> [miriel026:165125] mca:rmaps:select: checking available component >>>>>> rank_file >>>>>> [miriel026:165125] mca:rmaps:select: Querying component [rank_file] >>>>>> [miriel026:165125] mca:rmaps:select: checking available component >>>>>> resilient >>>>>> [miriel026:165125] mca:rmaps:select: Querying component [resilient] >>>>>> [miriel026:165125] mca:rmaps:select: checking available component >>>>>> round_robin >>>>>> [miriel026:165125] mca:rmaps:select: Querying component [round_robin] >>>>>> [miriel026:165125] mca:rmaps:select: checking available component seq >>>>>> [miriel026:165125] mca:rmaps:select: Querying component [seq] >>>>>> [miriel026:165125] [[29888,0],2]: Final mapper priorities >>>>>> [miriel026:165125] Mapper: ppr Priority: 90 >>>>>> [[devel11:17550] mca:rmaps: mapping job [29888,1] >>>>>> [devel11:17550] mca:rmaps: setting mapping policies for job [29888,1] >>>>>> nprocs 48 >>>>>> [devel11:17550] mca:rmaps[169] mapping not set by user - using bynuma >>>>>> [devel11:17550] mca:rmaps:ppr: job [29888,1] not using ppr mapper PPR >>>>>> NULL policy PPR NOTSET >>>>>> [devel11:17550] [[29888,0],0] rmaps:seq called on job [29888,1] >>>>>> [devel11:17550] mca:rmaps:seq: job [29888,1] not using seq mapper >>>>>> [devel11:17550] mca:rmaps:resilient: cannot perform initial map of job >>>>>> [29888,1] - no fault groups >>>>>> [devel11:17550] mca:rmaps:mindist: job [29888,1] not using mindist mapper >>>>>> [devel11:17550] mca:rmaps:rr: mapping job [29888,1] >>>>>> [devel11:17550] [[29888,0],0] Starting with 2 nodes in list >>>>>> [devel11:17550] 
[[29888,0],0] Filtering thru apps >>>>>> [miriel025:62329] mca:rmaps:select: checking available component >>>>>> resilient >>>>>> [miriel025:62329] mca:rmaps:select: Querying component [resilient] >>>>>> [miriel025:62329] mca:rmaps:select: checking available component >>>>>> round_robin >>>>>> [miriel025:62329] mca:rmaps:select: Querying component [round_robin] >>>>>> [miriel025:62329] mca:rmaps:select: checking available component seq >>>>>> [miriel025:62329] mca:rmaps:select: Querying component [seq] >>>>>> [miriel025:62329] [[29888,0],1]: Final mapper priorities >>>>>> [miriel025:62329] Mapper: ppr Priority: 90 >>>>>> [miriel025:62329] Mapper: seq Priority: 60 >>>>>> [miriel025:62329] Mapper: resilient Priority: 40 >>>>>> [miriel025:62329] Mapper: mindist Priority: 20 >>>>>> [miriel025:62329] Mapper: round_robin Priority: 10 >>>>>> [miriel025:62329] Mapper: rank_file Priority: 0 >>>>>> [miriel025:62329] mca:rmaps: mapping job [29888,1] >>>>>> [miriel025:62329] mca:rmaps: setting mapping policies for job [29888,1] >>>>>> nprocs 48 >>>>>> [miriel025:62329] mca:rmaps[169] mapping not set by user - using bynuma >>>>>> [miriel025:62329] mca:rmaps:ppr: job [29888,1] not using ppr mapper PPR >>>>>> NULL policy PPR NOTSET >>>>>> [miriel025:62329] [[29888,0],1] rmaps:seq called on job [29888,1] >>>>>> [miriel025:62329] mca:rmaps:seq: job [29888,1] not using seq mapper >>>>>> [miriel025:62329] mca:rmaps:resilient: cannot perform initial map of job >>>>>> [29888,1] - no fault groups >>>>>> [miriel025:62329] mca:rmaps:mindist: job [29888,1] not using mindist >>>>>> mapper >>>>>> [miriel025:62329] mca:rmaps:rr: mapping job [29888,1] >>>>>> [miriel025:62329] [[29888,0],1] Starting with 2 nodes in list >>>>>> [miriel025:62329] [[29888,0],1] Filtering thru apps >>>>>> [devel11:17550] [[29888,0],0] Retained 2 nodes in list >>>>>> [devel11:17550] [[29888,0],0] node miriel025 has 24 slots available >>>>>> [devel11:17550] [[29888,0],0] node miriel026 has 24 slots available >>>>>> 
[devel11:17550] AVAILABLE NODES FOR MAPPING: >>>>>> [devel11:17550] node: miriel025 daemon: 1 >>>>>> [devel11:17550] node: miriel026 daemon: 2 >>>>>> [devel11:17550] [[29888,0],0] Starting bookmark at node miriel025 >>>>>> [devel11:17550] [[29888,0],0] Starting at node miriel025 >>>>>> [devel11:17550] mca:rmaps:rr: mapping no-span by NUMANode for job >>>>>> [29888,1] slots 48 num_procs 48 >>>>>> [devel11:17550] mca:rmaps:rr: found 4 NUMANode objects on node miriel025 >>>>>> [devel11:17550] mca:rmaps:rr: calculated nprocs 24 >>>>>> [devel11:17550] mca:rmaps:rr: assigning nprocs 24 >>>>>> [devel11:17550] mca:rmaps:rr: found 4 NUMANode objects on node miriel026 >>>>>> [devel11:17550] mca:rmaps:rr: calculated nprocs 24 >>>>>> [devel11:17550] mca:rmaps:rr: assigning nprocs 24 >>>>>> [devel11:17550] mca:rmaps:base: computing vpids by slot for job [29888,1] >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 0 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 1 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 2 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 3 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 4 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 5 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 6 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 7 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 8 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 9 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 10 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 11 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 12 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 13 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 14 to node miriel025 >>>>>> 
[devel11:17550] mca:rmaps:base: assigning rank 15 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 16 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 17 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 18 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 19 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 20 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 21 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 22 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 23 to node miriel025 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 24 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 25 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 26 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 27 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 28 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 29 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 30 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 31 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 32 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 33 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 34 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 35 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 36 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 37 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 38 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 39 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 40 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 41 
to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 42 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 43 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 44 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 45 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 46 to node miriel026 >>>>>> [devel11:17550] mca:rmaps:base: assigning rank 47 to node miriel026 >>>>>> [devel11:17550] [[29888,0],0] rmaps:base:compute_usage >>>>>> [devel11:17550] mca:rmaps: compute bindings for job [29888,1] with >>>>>> policy CORE[4008] >>>>>> [devel11:17550] [[29888,0],0] bind_depth: 6 map_depth 2 >>>>>> [devel11:17550] mca:rmaps: bind downward for job [29888,1] with bindings >>>>>> CORE >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],0] BITMAP 0 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],0][miriel025] TO >>>>>> socket 0[core 0[hwt 0]]: >>>>>> [B/././././././././././.][./././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],1] BITMAP 12 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],1][miriel025] TO >>>>>> socket 0[core 6[hwt 0]]: >>>>>> [././././././B/././././.][./././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],2] BITMAP 1 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],2][miriel025] TO >>>>>> socket 1[core 12[hwt 0]]: >>>>>> [./././././././././././.][B/././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],3] BITMAP 13 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],3][miriel025] TO >>>>>> socket 1[core 18[hwt 0]]: >>>>>> [./././././././././././.][././././././B/././././.] 
>>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],4] BITMAP 2 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],4][miriel025] TO >>>>>> socket 0[core 1[hwt 0]]: >>>>>> [./B/./././././././././.][./././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],5] BITMAP 14 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],5][miriel025] TO >>>>>> socket 0[core 7[hwt 0]]: >>>>>> [./././././././B/./././.][./././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],6] BITMAP 3 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],6][miriel025] TO >>>>>> socket 1[core 13[hwt 0]]: >>>>>> [./././././././././././.][./B/./././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],7] BITMAP 15 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],7][miriel025] TO >>>>>> socket 1[core 19[hwt 0]]: >>>>>> [./././././././././././.][./././././././B/./././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],8] BITMAP 4 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],8][miriel025] TO >>>>>> socket 0[core 2[hwt 0]]: >>>>>> [././B/././././././././.][./././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],9] BITMAP 16 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],9][miriel025] TO >>>>>> socket 0[core 8[hwt 0]]: >>>>>> [././././././././B/././.][./././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],10] BITMAP 5 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],10][miriel025] TO >>>>>> socket 1[core 14[hwt 0]]: >>>>>> [./././././././././././.][././B/././././././././.] 
>>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],11] BITMAP 17 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],11][miriel025] TO >>>>>> socket 1[core 20[hwt 0]]: >>>>>> [./././././././././././.][././././././././B/././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],12] BITMAP 6 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],12][miriel025] TO >>>>>> socket 0[core 3[hwt 0]]: >>>>>> [./././B/./././././././.][./././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],13] BITMAP 18 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],13][miriel025] TO >>>>>> socket 0[core 9[hwt 0]]: >>>>>> [./././././././././B/./.][./././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],14] BITMAP 7 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],14][miriel025] TO >>>>>> socket 1[core 15[hwt 0]]: >>>>>> [./././././././././././.][./././B/./././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],15] BITMAP 19 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],15][miriel025] TO >>>>>> socket 1[core 21[hwt 0]]: >>>>>> [./././././././././././.][./././././././././B/./.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],16] BITMAP 8 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],16][miriel025] TO >>>>>> socket 0[core 4[hwt 0]]: >>>>>> [././././B/././././././.][./././././././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],17] BITMAP 20 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],17][miriel025] TO >>>>>> socket 0[core 10[hwt 0]]: >>>>>> [././././././././././B/.][./././././././././././.] 
>>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],18] BITMAP 9 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],18][miriel025] TO >>>>>> socket 1[core 16[hwt 0]]: >>>>>> [./././././././././././.][././././B/././././././.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],19] BITMAP 21 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],19][miriel025] TO >>>>>> socket 1[core 22[hwt 0]]: >>>>>> [./././././././././././.][././././././././././B/.] >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],20] BITMAP 10 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],20][miriel025] TO >>>>>> socket 0[core 5[hwt 0]]: >>>>>> [./././././B/./././././.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] Retained 2 nodes in list >>>>>> [miriel025:62329] [[29888,0],1] node miriel025 has 24 slots available >>>>>> [miriel025:62329] [[29888,0],1] node miriel026 has 24 slots available >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [miriel025:62329] AVAILABLE NODES FOR MAPPING: >>>>>> [miriel025:62329] node: miriel025 daemon: 1 >>>>>> [miriel025:62329] node: miriel026 daemon: 2 >>>>>> [miriel025:62329] [[29888,0],1] Starting bookmark at node miriel025 >>>>>> [miriel025:62329] [[29888,0],1] Starting at node miriel025 >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],21] BITMAP 22 >>>>>> [miriel025:62329] mca:rmaps:rr: mapping no-span by NUMANode for job >>>>>> [29888,1] slots 48 num_procs 48 >>>>>> [miriel025:62329] mca:rmaps:rr: found 4 NUMANode objects on node >>>>>> miriel025 >>>>>> [miriel025:62329] mca:rmaps:rr: calculated nprocs 24 >>>>>> [miriel025:62329] mca:rmaps:rr: assigning nprocs 24 >>>>>> [miriel025:62329] mca:rmaps:rr: found 4 NUMANode objects on node >>>>>> miriel026 >>>>>> [miriel025:62329] mca:rmaps:rr: calculated nprocs 24 >>>>>> [miriel025:62329] mca:rmaps:rr: assigning 
nprocs 24 >>>>>> [miriel025:62329] mca:rmaps:base: computing vpids by slot for job >>>>>> [29888,1] >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 0 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 1 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 2 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 3 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 4 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 5 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 6 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 7 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 8 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 9 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 10 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 11 to node miriel025 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],21][miriel025] TO >>>>>> socket 0[core 11[hwt 0]]: >>>>>> [./././././././././././B][./././././././././././.] 
>>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 12 to node miriel025 >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 13 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 14 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 15 to node miriel025 >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],22] BITMAP 11 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 16 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 17 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 18 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 19 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 20 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 21 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 22 to node miriel025 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 23 to node miriel025 >>>>>> miriel026:165125] Mapper: seq Priority: 60 >>>>>> [miriel026:165125] Mapper: resilient Priority: 40 >>>>>> [miriel026:165125] Mapper: mindist Priority: 20 >>>>>> [miriel026:165125] Mapper: round_robin Priority: 10 >>>>>> [miriel026:165125] Mapper: rank_file Priority: 0 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 24 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 25 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps: mapping job [29888,1] >>>>>> [miriel026:165125] mca:rmaps: setting mapping policies for job [29888,1] >>>>>> nprocs 48 >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],22][miriel025] TO >>>>>> socket 1[core 17[hwt 0]]: >>>>>> [./././././././././././.][./././././B/./././././.] 
>>>>>> [miriel026:165125] mca:rmaps[169] mapping not set by user - using bynuma >>>>>> [miriel026:165125] mca:rmaps:ppr: job [29888,1] not using ppr mapper PPR >>>>>> NULL policy PPR NOTSET >>>>>> [devel11:17550] [[29888,0],0] GOT 1 CPUS >>>>>> [miriel026:165125] [[29888,0],2] rmaps:seq called on job [29888,1] >>>>>> [miriel026:165125] mca:rmaps:seq: job [29888,1] not using seq mapper >>>>>> [devel11:17550] [[29888,0],0] PROC [[29888,1],23] BITMAP 23 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 26 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 27 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 28 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 29 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:resilient: cannot perform initial map of >>>>>> job [29888,1] - no fault groups >>>>>> [miriel026:165125] mca:rmaps:mindist: job [29888,1] not using mindist >>>>>> mapper >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 30 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 31 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 32 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 33 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:rr: mapping job [29888,1] >>>>>> [miriel026:165125] [[29888,0],2] Starting with 2 nodes in list >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 34 to node miriel026 >>>>>> [miriel026:165125] [[29888,0],2] Filtering thru apps >>>>>> [devel11:17550] [[29888,0],0] BOUND PROC [[29888,1],23][miriel025] TO >>>>>> socket 1[core 23[hwt 0]]: >>>>>> [./././././././././././.][./././././././././././B] >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 35 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 36 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 37 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning 
rank 38 to node miriel026 >>>>>> [miriel026:165125] [[29888,0],2] Retained 2 nodes in list >>>>>> [miriel026:165125] [[29888,0],2] node miriel025 has 24 slots available >>>>>> [miriel026:165125] [[29888,0],2] node miriel026 has 24 slots available >>>>>> [miriel026:165125] AVAILABLE NODES FOR MAPPING: >>>>>> [miriel026:165125] node: miriel025 daemon: 1 >>>>>> [miriel026:165125] node: miriel026 daemon: 2 >>>>>> [miriel026:165125] [[29888,0],2] Starting bookmark at node miriel025 >>>>>> [miriel026:165125] [[29888,0],2] Starting at node miriel025 >>>>>> [miriel026:165125] mca:rmaps:rr: mapping no-span by NUMANode for job >>>>>> [29888,1] slots 48 num_procs 48 >>>>>> [miriel026:165125] mca:rmaps:rr: found 4 NUMANode objects on node >>>>>> miriel025 >>>>>> [miriel026:165125] mca:rmaps:rr: calculated nprocs 24 >>>>>> [miriel026:165125] mca:rmaps:rr: assigning nprocs 24 >>>>>> [miriel026:165125] mca:rmaps:rr: found 4 NUMANode objects on node >>>>>> miriel026 >>>>>> [miriel026:165125] mca:rmaps:rr: calculated nprocs 24 >>>>>> [miriel026:165125] mca:rmaps:rr: assigning nprocs 24 >>>>>> [miriel026:165125] mca:rmaps:base: computing vpids by slot for job >>>>>> [29888,1] >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 0 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 1 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 2 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 3 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 4 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 5 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 6 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 7 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 8 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 9 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning 
rank 10 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 11 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 12 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 13 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 14 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 15 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 16 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 17 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 18 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 19 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 20 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 21 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 22 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 23 to node miriel025 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 24 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 25 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 26 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 27 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 28 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 29 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 30 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 31 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 32 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 33 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 34 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 35 to node miriel026 >>>>>> 
[miriel026:165125] mca:rmaps:base: assigning rank 36 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 37 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 38 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 39 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 40 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 41 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 42 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 43 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 44 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 45 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 46 to node miriel026 >>>>>> [miriel026:165125] mca:rmaps:base: assigning rank 47 to node miriel026 >>>>>> [miriel026:165125] [[29888,0],2] rmaps:base:compute_usage >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 39 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 40 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 41 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 42 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 43 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 44 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 45 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 46 to node miriel026 >>>>>> [miriel025:62329] mca:rmaps:base: assigning rank 47 to node miriel026 >>>>>> [miriel025:62329] [[29888,0],1] rmaps:base:compute_usage >>>>>> [miriel025:62329] mca:rmaps: compute bindings for job [29888,1] with >>>>>> policy CORE[4008] >>>>>> [miriel025:62329] [[29888,0],1] bind_depth: 6 map_depth 2 >>>>>> [miriel025:62329] mca:rmaps: bind downward for job [29888,1] with >>>>>> 
bindings CORE >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],24] BITMAP 0 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],24][miriel026] TO >>>>>> socket 0[core 0[hwt 0]]: >>>>>> [B/././././././././././.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],25] BITMAP 12 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],25][miriel026] TO >>>>>> socket 0[core 6[hwt 0]]: >>>>>> [././././././B/././././.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],26] BITMAP 1 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],26][miriel026] TO >>>>>> socket 1[core 12[hwt 0]]: >>>>>> [./././././././././././.][B/././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],27] BITMAP 13 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],27][miriel026] TO >>>>>> socket 1[core 18[hwt 0]]: >>>>>> [./././././././././././.][././././././B/././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],28] BITMAP 2 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],28][miriel026] TO >>>>>> socket 0[core 1[hwt 0]]: >>>>>> [./B/./././././././././.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],29] BITMAP 14 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],29][miriel026] TO >>>>>> socket 0[core 7[hwt 0]]: >>>>>> [./././././././B/./././.][./././././././././././.] 
>>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],30] BITMAP 3 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],30][miriel026] TO >>>>>> socket 1[core 13[hwt 0]]: >>>>>> [./././././././././././.][./B/./././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],31] BITMAP 15 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],31][miriel026] TO >>>>>> socket 1[core 19[hwt 0]]: >>>>>> [./././././././././././.][./././././././B/./././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],32] BITMAP 4 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],32][miriel026] TO >>>>>> socket 0[core 2[hwt 0]]: >>>>>> [././B/././././././././.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],33] BITMAP 16 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],33][miriel026] TO >>>>>> socket 0[core 8[hwt 0]]: >>>>>> [././././././././B/././.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],34] BITMAP 5 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],34][miriel026] TO >>>>>> socket 1[core 14[hwt 0]]: >>>>>> [./././././././././././.][././B/././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],35] BITMAP 17 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],35][miriel026] TO >>>>>> socket 1[core 20[hwt 0]]: >>>>>> [./././././././././././.][././././././././B/././.] 
>>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],36] BITMAP 6 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],36][miriel026] TO >>>>>> socket 0[core 3[hwt 0]]: >>>>>> [./././B/./././././././.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],37] BITMAP 18 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],37][miriel026] TO >>>>>> socket 0[core 9[hwt 0]]: >>>>>> [./././././././././B/./.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],38] BITMAP 7 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],38][miriel026] TO >>>>>> socket 1[core 15[hwt 0]]: >>>>>> [./././././././././././.][./././B/./././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],39] BITMAP 19 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],39][miriel026] TO >>>>>> socket 1[core 21[hwt 0]]: >>>>>> [./././././././././././.][./././././././././B/./.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],40] BITMAP 8 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],40][miriel026] TO >>>>>> socket 0[core 4[hwt 0]]: >>>>>> [././././B/././././././.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],41] BITMAP 20 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],41][miriel026] TO >>>>>> socket 0[core 10[hwt 0]]: >>>>>> [././././././././././B/.][./././././././././././.] 
>>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],42] BITMAP 9 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],42][miriel026] TO >>>>>> socket 1[core 16[hwt 0]]: >>>>>> [./././././././././././.][././././B/././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],43] BITMAP 21 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],43][miriel026] TO >>>>>> socket 1[core 22[hwt 0]]: >>>>>> [./././././././././././.][././././././././././B/.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],44] BITMAP 10 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],44][miriel026] TO >>>>>> socket 0[core 5[hwt 0]]: >>>>>> [./././././B/./././././.][./././././././././././.] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],45] BITMAP 22 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],45][miriel026] TO >>>>>> socket 0[core 11[hwt 0]]: >>>>>> [./././././././././././B][./././././././././././.] >>>>>> [miriel026:165125] mca:rmaps: compute bindings for job [29888,1] with >>>>>> policy CORE[4008] >>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],46] BITMAP 11 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],46][miriel026] TO >>>>>> socket 1[core 17[hwt 0]]: >>>>>> [./././././././././././.][./././././B/./././././.] 
>>>>>> [miriel025:62329] [[29888,0],1] GOT 1 CPUS >>>>>> [miriel025:62329] [[29888,0],1] PROC [[29888,1],47] BITMAP 23 >>>>>> [miriel025:62329] [[29888,0],1] BOUND PROC [[29888,1],47][miriel026] TO >>>>>> socket 1[core 23[hwt 0]]: >>>>>> [./././././././././././.][./././././././././././B] >>>>>> [miriel025:62342] MCW rank 0 not bound >>>>>> [miriel026:165138] MCW rank 24 not bound >>>>>> [miriel025:62343] MCW rank 1 not bound >>>>>> [miriel026:165139] MCW rank 25 not bound >>>>>> [miriel025:62344] MCW rank 2 not bound >>>>>> [miriel026:165140] MCW rank 26 not bound >>>>>> [miriel025:62345] MCW rank 3 not bound >>>>>> [miriel026:165141] MCW rank 27 not bound >>>>>> [miriel025:62346] MCW rank 4 not bound >>>>>> [miriel026:165142] MCW rank 28 not bound >>>>>> [miriel025:62348] MCW rank 5 not bound >>>>>> [miriel026:165144] MCW rank 29 not bound >>>>>> [miriel025:62350] MCW rank 6 not bound >>>>>> [miriel026:165146] MCW rank 30 not bound >>>>>> [miriel025:62352] MCW rank 7 not bound >>>>>> [miriel025:62354] MCW rank 8 not bound >>>>>> [miriel026:165148] MCW rank 31 not bound >>>>>> [miriel025:62356] MCW rank 9 not bound >>>>>> [miriel025:62358] MCW rank 10 not bound >>>>>> [miriel026:165150] MCW rank 32 not bound >>>>>> [miriel026:165152] MCW rank 33 not bound >>>>>> [miriel026:165154] MCW rank 34 not bound >>>>>> [miriel025:62360] MCW rank 11 not bound >>>>>> [miriel026:165156] MCW rank 35 not bound >>>>>> [miriel025:62362] MCW rank 12 not bound >>>>>> [miriel026:165158] MCW rank 36 not bound >>>>>> [miriel025:62366] MCW rank 14 not bound >>>>>> [miriel025:62364] MCW rank 13 not bound >>>>>> [miriel026:165160] MCW rank 37 not bound >>>>>> [miriel025:62368] MCW rank 15 not bound >>>>>> [miriel026:165163] MCW rank 39 not bound >>>>>> [miriel025:62370] MCW rank 16 not bound >>>>>> [miriel026:165166] MCW rank 40 not bound >>>>>> [miriel026:165167] MCW rank 41 not bound >>>>>> [miriel026:165162] MCW rank 38 not bound >>>>>> [miriel025:62372] MCW rank 17 not bound >>>>>> 
[miriel025:62374] MCW rank 18 not bound
>>>>>> [miriel026:165170] MCW rank 42 not bound
>>>>>> [miriel025:62376] MCW rank 19 not bound
>>>>>> [miriel026:165171] MCW rank 43 not bound
>>>>>> [miriel025:62377] MCW rank 20 not bound
>>>>>> [miriel026:165173] MCW rank 44 not bound
>>>>>> [miriel025:62380] MCW rank 21 not bound
>>>>>> [miriel025:62381] MCW rank 22 not bound
>>>>>> [miriel026:165177] MCW rank 46 not bound
>>>>>> [miriel026:165175] MCW rank 45 not bound
>>>>>> [miriel026:165179] MCW rank 47 not bound
>>>>>> [miriel025:62384] MCW rank 23 not bound
>>>>>>
>>>>>>
>>>>>> On 13/04/2017 at 16:52, r...@open-mpi.org wrote:
>>>>>>> Okay, so your login node was able to figure out all the bindings. I
>>>>>>> don’t see any debug output from your compute nodes, which is suspicious.
>>>>>>>
>>>>>>> Try adding --leave-session-attached to the cmd line and let’s see if we
>>>>>>> can capture the compute node daemon’s output.
>>>>>>>
>>>>>>>> On Apr 13, 2017, at 7:48 AM, Cyril Bordage <cyril.bord...@inria.fr>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> My machine file is: miriel025*24 miriel026*24
>>>>>>>>
>>>>>>>> On 13/04/2017 at 16:46, Cyril Bordage wrote:
>>>>>>>>> Here is the output:
>>>>>>>>> ##############################################################################
>>>>>>>>> [devel11:80858] [[2965,0],0] rmaps:base set policy with NULL device NONNULL
>>>>>>>>> [devel11:80858] mca:rmaps:select: checking available component mindist
>>>>>>>>> [devel11:80858] mca:rmaps:select: Querying component [mindist]
>>>>>>>>> [devel11:80858] mca:rmaps:select: checking available component ppr
>>>>>>>>> [devel11:80858] mca:rmaps:select: Querying component [ppr]
>>>>>>>>> [devel11:80858] mca:rmaps:select: checking available component rank_file
>>>>>>>>> [devel11:80858] mca:rmaps:select: Querying component [rank_file]
>>>>>>>>> [devel11:80858] mca:rmaps:select: checking available component resilient
>>>>>>>>> [devel11:80858] mca:rmaps:select: Querying
component [resilient] >>>>>>>>> [devel11:80858] mca:rmaps:select: checking available component >>>>>>>>> round_robin >>>>>>>>> [devel11:80858] mca:rmaps:select: Querying component [round_robin] >>>>>>>>> [devel11:80858] mca:rmaps:select: checking available component seq >>>>>>>>> [devel11:80858] mca:rmaps:select: Querying component [seq] >>>>>>>>> [devel11:80858] [[2965,0],0]: Final mapper priorities >>>>>>>>> [devel11:80858] Mapper: ppr Priority: 90 >>>>>>>>> [devel11:80858] Mapper: seq Priority: 60 >>>>>>>>> [devel11:80858] Mapper: resilient Priority: 40 >>>>>>>>> [devel11:80858] Mapper: mindist Priority: 20 >>>>>>>>> [devel11:80858] Mapper: round_robin Priority: 10 >>>>>>>>> [devel11:80858] Mapper: rank_file Priority: 0 >>>>>>>>> [devel11:80858] mca:rmaps: mapping job [2965,1] >>>>>>>>> [devel11:80858] mca:rmaps: setting mapping policies for job [2965,1] >>>>>>>>> nprocs 48 >>>>>>>>> [devel11:80858] mca:rmaps[169] mapping not set by user - using bynuma >>>>>>>>> [devel11:80858] mca:rmaps:ppr: job [2965,1] not using ppr mapper PPR >>>>>>>>> NULL policy PPR NOTSET >>>>>>>>> [devel11:80858] [[2965,0],0] rmaps:seq called on job [2965,1] >>>>>>>>> [devel11:80858] mca:rmaps:seq: job [2965,1] not using seq mapper >>>>>>>>> [devel11:80858] mca:rmaps:resilient: cannot perform initial map of job >>>>>>>>> [2965,1] - no fault groups >>>>>>>>> [devel11:80858] mca:rmaps:mindist: job [2965,1] not using mindist >>>>>>>>> mapper >>>>>>>>> [devel11:80858] mca:rmaps:rr: mapping job [2965,1] >>>>>>>>> [devel11:80858] [[2965,0],0] Starting with 2 nodes in list >>>>>>>>> [devel11:80858] [[2965,0],0] Filtering thru apps >>>>>>>>> [devel11:80858] [[2965,0],0] Retained 2 nodes in list >>>>>>>>> [devel11:80858] [[2965,0],0] node miriel025 has 24 slots available >>>>>>>>> [devel11:80858] [[2965,0],0] node miriel026 has 24 slots available >>>>>>>>> [devel11:80858] AVAILABLE NODES FOR MAPPING: >>>>>>>>> [devel11:80858] node: miriel025 daemon: 1 >>>>>>>>> [devel11:80858] node: miriel026 
daemon: 2 >>>>>>>>> [devel11:80858] [[2965,0],0] Starting bookmark at node miriel025 >>>>>>>>> [devel11:80858] [[2965,0],0] Starting at node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:rr: mapping no-span by NUMANode for job >>>>>>>>> [2965,1] slots 48 num_procs 48 >>>>>>>>> [devel11:80858] mca:rmaps:rr: found 4 NUMANode objects on node >>>>>>>>> miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:rr: calculated nprocs 24 >>>>>>>>> [devel11:80858] mca:rmaps:rr: assigning nprocs 24 >>>>>>>>> [devel11:80858] mca:rmaps:rr: found 4 NUMANode objects on node >>>>>>>>> miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:rr: calculated nprocs 24 >>>>>>>>> [devel11:80858] mca:rmaps:rr: assigning nprocs 24 >>>>>>>>> [devel11:80858] mca:rmaps:base: computing vpids by slot for job >>>>>>>>> [2965,1] >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 0 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 1 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 2 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 3 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 4 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 5 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 6 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 7 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 8 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 9 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 10 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 11 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 12 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 13 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 14 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: 
assigning rank 15 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 16 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 17 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 18 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 19 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 20 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 21 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 22 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 23 to node miriel025 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 24 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 25 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 26 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 27 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 28 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 29 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 30 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 31 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 32 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 33 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 34 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 35 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 36 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 37 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 38 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 39 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 40 to node miriel026 >>>>>>>>> 
[devel11:80858] mca:rmaps:base: assigning rank 41 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 42 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 43 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 44 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 45 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 46 to node miriel026 >>>>>>>>> [devel11:80858] mca:rmaps:base: assigning rank 47 to node miriel026 >>>>>>>>> [devel11:80858] [[2965,0],0] rmaps:base:compute_usage >>>>>>>>> [devel11:80858] mca:rmaps: compute bindings for job [2965,1] with >>>>>>>>> policy >>>>>>>>> CORE[4008] >>>>>>>>> [devel11:80858] [[2965,0],0] bind_depth: 6 map_depth 2 >>>>>>>>> [devel11:80858] mca:rmaps: bind downward for job [2965,1] with >>>>>>>>> bindings CORE >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],0] BITMAP 0 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],0][miriel025] TO >>>>>>>>> socket 0[core 0[hwt 0]]: >>>>>>>>> [B/././././././././././.][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],1] BITMAP 12 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],1][miriel025] TO >>>>>>>>> socket 0[core 6[hwt 0]]: >>>>>>>>> [././././././B/././././.][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],2] BITMAP 1 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],2][miriel025] TO >>>>>>>>> socket 1[core 12[hwt 0]]: >>>>>>>>> [./././././././././././.][B/././././././././././.] 
>>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],3] BITMAP 13 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],3][miriel025] TO >>>>>>>>> socket 1[core 18[hwt 0]]: >>>>>>>>> [./././././././././././.][././././././B/././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],4] BITMAP 2 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],4][miriel025] TO >>>>>>>>> socket 0[core 1[hwt 0]]: >>>>>>>>> [./B/./././././././././.][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],5] BITMAP 14 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],5][miriel025] TO >>>>>>>>> socket 0[core 7[hwt 0]]: >>>>>>>>> [./././././././B/./././.][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],6] BITMAP 3 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],6][miriel025] TO >>>>>>>>> socket 1[core 13[hwt 0]]: >>>>>>>>> [./././././././././././.][./B/./././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],7] BITMAP 15 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],7][miriel025] TO >>>>>>>>> socket 1[core 19[hwt 0]]: >>>>>>>>> [./././././././././././.][./././././././B/./././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],8] BITMAP 4 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],8][miriel025] TO >>>>>>>>> socket 0[core 2[hwt 0]]: >>>>>>>>> [././B/././././././././.][./././././././././././.] 
>>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],9] BITMAP 16 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],9][miriel025] TO >>>>>>>>> socket 0[core 8[hwt 0]]: >>>>>>>>> [././././././././B/././.][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],10] BITMAP 5 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],10][miriel025] TO >>>>>>>>> socket 1[core 14[hwt 0]]: >>>>>>>>> [./././././././././././.][././B/././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],11] BITMAP 17 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],11][miriel025] TO >>>>>>>>> socket 1[core 20[hwt 0]]: >>>>>>>>> [./././././././././././.][././././././././B/././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],12] BITMAP 6 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],12][miriel025] TO >>>>>>>>> socket 0[core 3[hwt 0]]: >>>>>>>>> [./././B/./././././././.][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],13] BITMAP 18 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],13][miriel025] TO >>>>>>>>> socket 0[core 9[hwt 0]]: >>>>>>>>> [./././././././././B/./.][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],14] BITMAP 7 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],14][miriel025] TO >>>>>>>>> socket 1[core 15[hwt 0]]: >>>>>>>>> [./././././././././././.][./././B/./././././././.] 
>>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],15] BITMAP 19 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],15][miriel025] TO >>>>>>>>> socket 1[core 21[hwt 0]]: >>>>>>>>> [./././././././././././.][./././././././././B/./.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],16] BITMAP 8 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],16][miriel025] TO >>>>>>>>> socket 0[core 4[hwt 0]]: >>>>>>>>> [././././B/././././././.][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],17] BITMAP 20 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],17][miriel025] TO >>>>>>>>> socket 0[core 10[hwt 0]]: >>>>>>>>> [././././././././././B/.][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],18] BITMAP 9 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],18][miriel025] TO >>>>>>>>> socket 1[core 16[hwt 0]]: >>>>>>>>> [./././././././././././.][././././B/././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],19] BITMAP 21 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],19][miriel025] TO >>>>>>>>> socket 1[core 22[hwt 0]]: >>>>>>>>> [./././././././././././.][././././././././././B/.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],20] BITMAP 10 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],20][miriel025] TO >>>>>>>>> socket 0[core 5[hwt 0]]: >>>>>>>>> [./././././B/./././././.][./././././././././././.] 
>>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],21] BITMAP 22 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],21][miriel025] TO >>>>>>>>> socket 0[core 11[hwt 0]]: >>>>>>>>> [./././././././././././B][./././././././././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],22] BITMAP 11 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],22][miriel025] TO >>>>>>>>> socket 1[core 17[hwt 0]]: >>>>>>>>> [./././././././././././.][./././././B/./././././.] >>>>>>>>> [devel11:80858] [[2965,0],0] GOT 1 CPUS >>>>>>>>> [devel11:80858] [[2965,0],0] PROC [[2965,1],23] BITMAP 23 >>>>>>>>> [devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],23][miriel025] TO >>>>>>>>> socket 1[core 23[hwt 0]]: >>>>>>>>> [./././././././././././.][./././././././././././B] >>>>>>>>> [miriel025:60980] MCW rank 11 not bound >>>>>>>>> [miriel025:60990] MCW rank 21 not bound >>>>>>>>> [miriel025:60981] MCW rank 12 not bound >>>>>>>>> [miriel025:60979] MCW rank 10 not bound >>>>>>>>> [miriel025:60977] MCW rank 8 not bound >>>>>>>>> [miriel025:60970] MCW rank 1 not bound >>>>>>>>> [miriel025:60972] MCW rank 3 not bound >>>>>>>>> [miriel025:60984] MCW rank 15 not bound >>>>>>>>> [miriel026:163985] MCW rank 34 not bound >>>>>>>>> [miriel026:163993] MCW rank 42 not bound >>>>>>>>> [miriel026:163981] MCW rank 30 not bound >>>>>>>>> [miriel026:163983] MCW rank 32 not bound >>>>>>>>> [miriel025:60975] MCW rank 6 not bound >>>>>>>>> [miriel025:60986] MCW rank 17 not bound >>>>>>>>> [miriel025:60992] MCW rank 23 not bound >>>>>>>>> [miriel025:60973] MCW rank 4 not bound >>>>>>>>> [miriel025:60978] MCW rank 9 not bound >>>>>>>>> [miriel025:60969] MCW rank 0 not bound >>>>>>>>> [miriel025:60991] MCW rank 22 not bound >>>>>>>>> [miriel025:60974] MCW rank 5 not bound >>>>>>>>> [miriel025:60982] MCW rank 13 not bound >>>>>>>>> [miriel025:60989] MCW rank 20 not bound >>>>>>>>> 
[miriel025:60988] MCW rank 19 not bound
>>>>>>>>> [miriel025:60983] MCW rank 14 not bound
>>>>>>>>> [miriel025:60987] MCW rank 18 not bound
>>>>>>>>> [miriel025:60976] MCW rank 7 not bound
>>>>>>>>> [miriel026:163996] MCW rank 45 not bound
>>>>>>>>> [miriel026:163979] MCW rank 28 not bound
>>>>>>>>> [miriel026:163990] MCW rank 39 not bound
>>>>>>>>> [miriel026:163976] MCW rank 25 not bound
>>>>>>>>> [miriel026:163997] MCW rank 46 not bound
>>>>>>>>> [miriel025:60971] MCW rank 2 not bound
>>>>>>>>> [miriel026:163995] MCW rank 44 not bound
>>>>>>>>> [miriel026:163987] MCW rank 36 not bound
>>>>>>>>> [miriel026:163982] MCW rank 31 not bound
>>>>>>>>> [miriel025:60985] MCW rank 16 not bound
>>>>>>>>> [miriel026:163980] MCW rank 29 not bound
>>>>>>>>> [miriel026:163975] MCW rank 24 not bound
>>>>>>>>> [miriel026:163978] MCW rank 27 not bound
>>>>>>>>> [miriel026:163992] MCW rank 41 not bound
>>>>>>>>> [miriel026:163991] MCW rank 40 not bound
>>>>>>>>> [miriel026:163998] MCW rank 47 not bound
>>>>>>>>> [miriel026:163986] MCW rank 35 not bound
>>>>>>>>> [miriel026:163984] MCW rank 33 not bound
>>>>>>>>> [miriel026:163989] MCW rank 38 not bound
>>>>>>>>> [miriel026:163994] MCW rank 43 not bound
>>>>>>>>> [miriel026:163988] MCW rank 37 not bound
>>>>>>>>> [miriel026:163977] MCW rank 26 not bound
>>>>>>>>> ##############################################################################
>>>>>>>>>
>>>>>>>>> On 13/04/2017 at 16:31, r...@open-mpi.org wrote:
>>>>>>>>>> Try adding "-mca rmaps_base_verbose 5" and see what that output
>>>>>>>>>> tells us - I assume you have a debug build configured, yes (i.e.,
>>>>>>>>>> added --enable-debug to configure line)?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On Apr 13, 2017, at 7:28 AM, Cyril Bordage <cyril.bord...@inria.fr>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> When I run this command from the compute node, I also get that. But not
>>>>>>>>>>> when I run it from a login node (with the same machine file).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Cyril.
>>>>>>>>>>>
>>>>>>>>>>> On 13/04/2017 at 16:22, r...@open-mpi.org wrote:
>>>>>>>>>>>> We are asking all these questions because we cannot replicate your
>>>>>>>>>>>> problem - so we are trying to help you figure out what is
>>>>>>>>>>>> different or missing from your machine. When I run your cmd line
>>>>>>>>>>>> on my system, I get:
>>>>>>>>>>>>
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 24 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../../../../../../../..][../../../../../../../../../../../..]
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 25 bound to socket 1[core 12[hwt 0-1]]: [../../../../../../../../../../../..][BB/../../../../../../../../../../..]
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 26 bound to socket 0[core 1[hwt 0-1]]: [../BB/../../../../../../../../../..][../../../../../../../../../../../..]
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 27 bound to socket 1[core 13[hwt 0-1]]: [../../../../../../../../../../../..][../BB/../../../../../../../../../..]
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 28 bound to socket 0[core 2[hwt 0-1]]: [../../BB/../../../../../../../../..][../../../../../../../../../../../..]
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 29 bound to socket 1[core 14[hwt 0-1]]: [../../../../../../../../../../../..][../../BB/../../../../../../../../..]
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 30 bound to socket 0[core 3[hwt 0-1]]: [../../../BB/../../../../../../../..][../../../../../../../../../../../..]
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 31 bound to socket 1[core 15[hwt 0-1]]: [../../../../../../../../../../../..][../../../BB/../../../../../../../..]
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 32 bound to socket 0[core 4[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../BB/../../../../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 33 bound to socket 1[core 16[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../BB/../../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 34 bound to socket 0[core 5[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../BB/../../../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 35 bound to socket 1[core 17[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../BB/../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 36 bound to socket 0[core 6[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../BB/../../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 37 bound to socket 1[core 18[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../../BB/../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 38 bound to socket 0[core 7[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../BB/../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 39 bound to socket 1[core 19[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../../../BB/../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 40 bound to socket 0[core 8[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../BB/../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 41 bound to socket 1[core 20[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../../../../BB/../../..] 
>>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 42 bound to socket 0[core 9[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../BB/../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 43 bound to socket 1[core 21[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../../../../../BB/../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 44 bound to socket 0[core 10[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../BB/..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 45 bound to socket 1[core 22[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../../../../../../BB/..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 46 bound to socket 0[core 11[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../BB][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc002.cluster:55965] MCW rank 47 bound to socket 1[core 23[hwt >>>>>>>>>>>> 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../../../../../../../BB] >>>>>>>>>>>> [rhc001:197743] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: >>>>>>>>>>>> [BB/../../../../../../../../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 1 bound to socket 1[core 12[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][BB/../../../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 2 bound to socket 0[core 1[hwt 0-1]]: >>>>>>>>>>>> [../BB/../../../../../../../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 3 bound to socket 1[core 13[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../BB/../../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 4 bound to socket 0[core 2[hwt 0-1]]: >>>>>>>>>>>> [../../BB/../../../../../../../../..][../../../../../../../../../../../..] 
>>>>>>>>>>>> [rhc001:197743] MCW rank 5 bound to socket 1[core 14[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../BB/../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 6 bound to socket 0[core 3[hwt 0-1]]: >>>>>>>>>>>> [../../../BB/../../../../../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 7 bound to socket 1[core 15[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../BB/../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 8 bound to socket 0[core 4[hwt 0-1]]: >>>>>>>>>>>> [../../../../BB/../../../../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 9 bound to socket 1[core 16[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../BB/../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 10 bound to socket 0[core 5[hwt 0-1]]: >>>>>>>>>>>> [../../../../../BB/../../../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 11 bound to socket 1[core 17[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../BB/../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 12 bound to socket 0[core 6[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../BB/../../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 13 bound to socket 1[core 18[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../../BB/../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 14 bound to socket 0[core 7[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../BB/../../../..][../../../../../../../../../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 15 bound to socket 1[core 19[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../../../../../..][../../../../../../../BB/../../../..] >>>>>>>>>>>> [rhc001:197743] MCW rank 16 bound to socket 0[core 8[hwt 0-1]]: >>>>>>>>>>>> [../../../../../../../../BB/../../..][../../../../../../../../../../../..] 
>>>>>>>>>>>> [rhc001:197743] MCW rank 17 bound to socket 1[core 20[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../../../BB/../../..]
>>>>>>>>>>>> [rhc001:197743] MCW rank 18 bound to socket 0[core 9[hwt 0-1]]: [../../../../../../../../../BB/../..][../../../../../../../../../../../..]
>>>>>>>>>>>> [rhc001:197743] MCW rank 19 bound to socket 1[core 21[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../../../../BB/../..]
>>>>>>>>>>>> [rhc001:197743] MCW rank 20 bound to socket 0[core 10[hwt 0-1]]: [../../../../../../../../../../BB/..][../../../../../../../../../../../..]
>>>>>>>>>>>> [rhc001:197743] MCW rank 21 bound to socket 1[core 22[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../../../../../BB/..]
>>>>>>>>>>>> [rhc001:197743] MCW rank 22 bound to socket 0[core 11[hwt 0-1]]: [../../../../../../../../../../../BB][../../../../../../../../../../../..]
>>>>>>>>>>>> [rhc001:197743] MCW rank 23 bound to socket 1[core 23[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../../../../../../BB]
>>>>>>>>>>>>
>>>>>>>>>>>> Exactly as expected. You might check that you have libnuma and
>>>>>>>>>>>> libnuma-devel installed.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> On Apr 13, 2017, at 6:50 AM, gil...@rist.or.jp wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> OK thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> we've had some issues in the past when Open MPI assumed that the
>>>>>>>>>>>>> (login) node running mpirun has the same topology as the other
>>>>>>>>>>>>> (compute) nodes.
>>>>>>>>>>>>> I just wanted to rule out this scenario.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Gilles
>>>>>>>>>>>>>
>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>> I am using the 6886c12 commit.
>>>>>>>>>>>>>> I have no particular option for the configuration.
>>>>>>>>>>>>>> I launch my application in the same way as I presented in my firt >>>>>>>>>>>>> email, >>>>>>>>>>>>>> there is the exact line: mpirun -np 48 -machinefile mf -bind-to >>>>>>>>>>>>>> core >>>>>>>>>>>>>> -report-bindings ./a.out >>>>>>>>>>>>>> >>>>>>>>>>>>>> lstopo does give the same output on both types on nodes. What is >>>>>>>>>>>>>> the >>>>>>>>>>>>>> purpose of that? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cyril. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Le 13/04/2017 à 15:24, gil...@rist.or.jp a écrit : >>>>>>>>>>>>>>> Also, can you please run >>>>>>>>>>>>>>> lstopo >>>>>>>>>>>>>>> on both your login and compute nodes ? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Gilles >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>>>>> Can you be a bit more specific? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - What version of Open MPI are you using? >>>>>>>>>>>>>>>> - How did you configure Open MPI? >>>>>>>>>>>>>>>> - How are you launching Open MPI applications? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Apr 13, 2017, at 9:08 AM, Cyril Bordage >>>>>>>>>>>>>>>>> <cyril.bord...@inria.fr >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> now this bug happens also when I launch my mpirun command >>>>>>>>>>>>>>>>> from the >>>>>>>>>>>>>>>>> compute node. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Cyril. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Le 06/04/2017 à 05:38, r...@open-mpi.org a écrit : >>>>>>>>>>>>>>>>>> I believe this has been fixed now - please let me know >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Mar 30, 2017, at 1:57 AM, Cyril Bordage >>>>>>>>>>>>>>>>>>> <cyril.bordage@inria. 
>>>>>>>>>>>>> fr >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I am using the git version of MPI with "-bind-to core >>>>>>>>>>>>>>>>>>> -report- >>>>>>>>>>>>>>> bindings" >>>>>>>>>>>>>>>>>>> and I get that for all processes: >>>>>>>>>>>>>>>>>>> [miriel010:160662] MCW rank 0 not bound >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> When I use an old version I get: >>>>>>>>>>>>>>>>>>> [miriel010:44921] MCW rank 0 bound to socket 0[core 0[hwt >>>>>>>>>>>>>>>>>>> 0]]: >>>>>>>>>>>>>>>>>>> [B/././././././././././.][./././././././././././.] >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> From git bisect the culprit seems to be: 48fc339 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> This bug happends only when I launch my mpirun command from >>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>> login node >>>>>>>>>>>>>>>>>>> and not >>>>>>>>>>>>>>>>>>> from a compute node. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Cyril. >>>>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>>>> devel mailing list >>>>>>>>>>>>>>>>>>> devel@lists.open-mpi.org >>>>>>>>>>>>>>>>>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel >>>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>>> devel mailing list >>>>>>>>>>>>>>>>>> devel@lists.open-mpi.org >>>>>>>>>>>>>>>>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>> devel mailing list >>>>>>>>>>>>>>>>> devel@lists.open-mpi.org >>>>>>>>>>>>>>>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Jeff Squyres >>>>>>>>>>>>>>>> jsquy...@cisco.com >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>> devel mailing list >>>>>>>>>>>>>>>> devel@lists.open-mpi.org >>>>>>>>>>>>>>>> 