Hi guys, I've run into an issue on our cluster related to the mapping and binding policies in 1.8.5.
The problem is that the --report-bindings output doesn't correspond to the locale shown by --display-devel-map. It looks like the mistake is in the output itself: it just prints a serial core number, even though that core can be on another socket. For example:

mpirun -np 2 --display-devel-map --report-bindings --map-by socket hostname

Data for JOB [43064,1] offset 0

Mapper requested: NULL Last mapper: round_robin Mapping policy: BYSOCKET Ranking policy: SOCKET
Binding policy: CORE Cpu set: NULL PPR: NULL Cpus-per-rank: 1
  Num new daemons: 0 New daemon starting vpid INVALID
  Num nodes: 1

Data for node: clx-orion-001 Launch id: -1 State: 2
  Daemon: [[43064,0],0] Daemon launched: True
  Num slots: 28 Slots in use: 2 Oversubscribed: FALSE
  Num slots allocated: 28 Max slots: 0
  Username on node: NULL
  Num procs: 2 Next node_rank: 2
  Data for proc: [[43064,1],0]
    Pid: 0 Local rank: 0 Node rank: 0 App rank: 0
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 0-6,14-20 Bind location: 0 Binding: 0
  Data for proc: [[43064,1],1]
    Pid: 0 Local rank: 1 Node rank: 1 App rank: 1
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 7-13,21-27 Bind location: 7 Binding: 7

[clx-orion-001:26951] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././././././././.][./././././././././././././.]
[clx-orion-001:26951] MCW rank 1 bound to socket 1[core 14[hwt 0]]: [./././././././././././././.][B/././././././././././././.]

The second process has locale 7-13,21-27 and bind location 7, so it should be reported as bound to core 7, not core 14.
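To make the mismatch concrete, here is a minimal Python sketch. It is not part of mpirun; the helper names (expand_cpu_list, bitmap_position, locale_of) are made up for illustration. It assumes that each Locale list in the devel map above corresponds to one socket, and that the binding bitmap has one hardware thread per core, as in the output above. It expands the two Locale lists and computes the serial position of the "B" in the rank 1 bitmap:

```python
import re

def expand_cpu_list(spec):
    """Expand a range list such as '7-13,21-27' into a set of integers."""
    cpus = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

def bitmap_position(bitmap):
    """Return (socket, index within socket, serial index) of the first 'B'.

    Assumes one hardware thread per core, so each '/'-separated field
    inside a [...] group is one core.
    """
    sockets = re.findall(r"\[([^\]]*)\]", bitmap)
    serial = 0
    for s, body in enumerate(sockets):
        cores = body.split("/")
        for i, core in enumerate(cores):
            if "B" in core:
                return s, i, serial + i
        serial += len(cores)
    return None

# Locale values copied from the devel map (rank 0, rank 1).
# Assumption: with --map-by socket, each locale is one socket.
locales = [expand_cpu_list("0-6,14-20"), expand_cpu_list("7-13,21-27")]

def locale_of(cpu):
    """Index of the locale that contains the given cpu number, or None."""
    for idx, cpus in enumerate(locales):
        if cpu in cpus:
            return idx
    return None

# Bitmap copied from the --report-bindings line for rank 1.
rank1_bitmap = "[./././././././././././././.][B/././././././././././././.]"

socket, index, serial = bitmap_position(rank1_bitmap)
print("bitmap: socket", socket, "core-in-socket", index, "serial core", serial)
print("bind location 7 is in locale", locale_of(7))
print("core 14 is in locale", locale_of(14))
```

Under these assumptions, the serial position of the "B" is the number that --report-bindings prints for rank 1, while in the devel map's numbering that same number belongs to the first locale rather than the second.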
Another example:

mpirun -np 8 --display-devel-map --report-bindings --map-by core hostname

Data for JOB [43202,1] offset 0

Mapper requested: NULL Last mapper: round_robin Mapping policy: BYCORE Ranking policy: CORE
Binding policy: CORE Cpu set: NULL PPR: NULL Cpus-per-rank: 1
  Num new daemons: 0 New daemon starting vpid INVALID
  Num nodes: 1

Data for node: clx-orion-001 Launch id: -1 State: 2
  Daemon: [[43202,0],0] Daemon launched: True
  Num slots: 28 Slots in use: 8 Oversubscribed: FALSE
  Num slots allocated: 28 Max slots: 0
  Username on node: NULL
  Num procs: 8 Next node_rank: 8
  Data for proc: [[43202,1],0]
    Pid: 0 Local rank: 0 Node rank: 0 App rank: 0
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 0 Bind location: 0 Binding: 0
  Data for proc: [[43202,1],1]
    Pid: 0 Local rank: 1 Node rank: 1 App rank: 1
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 1 Bind location: 1 Binding: 1
  Data for proc: [[43202,1],2]
    Pid: 0 Local rank: 2 Node rank: 2 App rank: 2
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 2 Bind location: 2 Binding: 2
  Data for proc: [[43202,1],3]
    Pid: 0 Local rank: 3 Node rank: 3 App rank: 3
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 3 Bind location: 3 Binding: 3
  Data for proc: [[43202,1],4]
    Pid: 0 Local rank: 4 Node rank: 4 App rank: 4
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 4 Bind location: 4 Binding: 4
  Data for proc: [[43202,1],5]
    Pid: 0 Local rank: 5 Node rank: 5 App rank: 5
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 5 Bind location: 5 Binding: 5
  Data for proc: [[43202,1],6]
    Pid: 0 Local rank: 6 Node rank: 6 App rank: 6
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 6 Bind location: 6 Binding: 6
  Data for proc: [[43202,1],7]
    Pid: 0 Local rank: 7 Node rank: 7 App rank: 7
    State: INITIALIZED Restarts: 0 App_context: 0 Locale: 14 Bind location: 14 Binding: 14

[clx-orion-001:27069] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 2 bound to socket 0[core 2[hwt 0]]: [././B/././././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 3 bound to socket 0[core 3[hwt 0]]: [./././B/./././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 4 bound to socket 0[core 4[hwt 0]]: [././././B/././././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 5 bound to socket 0[core 5[hwt 0]]: [./././././B/./././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 6 bound to socket 0[core 6[hwt 0]]: [././././././B/././././././.][./././././././././././././.]
[clx-orion-001:27069] MCW rank 7 bound to socket 0[core 7[hwt 0]]: [./././././././B/./././././.][./././././././././././././.]

Rank 7 has locale 14 and bind location 14, so it should be reported as bound to core 14 rather than core 7, since core 7 is on the other socket.

Best regards,
Elena