Your command line is incorrect: --map-by ppr:32:socket:PE=4 --bind-to hwthread
should be --map-by ppr:32:socket:PE=2 --bind-to core

On Feb 28, 2021, at 5:57 AM, Luis Cebamanos via users <users@lists.open-mpi.org> wrote:

I should have said, "I would like to run 128 MPI processes on 2 nodes" and not 64 like I initially said...

On Sat, 27 Feb 2021, 15:03 Luis Cebamanos, <luic...@gmail.com> wrote:

Hello OMPI users,

On 128-core nodes, 2 sockets x 64 cores/socket (2 hwthreads/core), I am trying to match the behavior of running with a rankfile by using manual mapping/ranking/binding options.

I would like to run 64 MPI processes on 2 nodes, 1 MPI process every 2 cores. That is, I want to run 32 MPI processes per socket on 2 128-core nodes. My mapping should be something like:

Node 0
=====
rank 0 - core 0
rank 1 - core 2
rank 2 - core 4
...
rank 63 - core 126

Node 1
====
rank 64 - core 0
rank 65 - core 2
rank 66 - core 4
...
rank 127 - core 126

If I use a rankfile:

rank 0=epsilon102 slot=0
rank 1=epsilon102 slot=2
rank 2=epsilon102 slot=4
rank 3=epsilon102 slot=6
rank 4=epsilon102 slot=8
rank 5=epsilon102 slot=10
....
rank 123=epsilon103 slot=118
rank 124=epsilon103 slot=120
rank 125=epsilon103 slot=122
rank 126=epsilon103 slot=124
rank 127=epsilon103 slot=126

My --report-bindings output looks like:

[epsilon102:2635370] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: [BB/../../..
/../../../../../../../../../../../../../../../../../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../..][../../../../../../../../../../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../../../../../../../../../../../../..]
[epsilon102:2635370] MCW rank 1 bound to socket 0[core 2[hwt 0-1]]: [../../BB/..
/../../../../../../../../../../../../../../../../../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../..][../../../../../../../../../../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../../../../../../../../../../../../..]
[epsilon102:2635370] MCW rank 2 bound to socket 0[core 4[hwt 0-1]]: [../../../..
/BB/../../../../../../../../../../../../../../../../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../..][../../../../../../../../../../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../../../../../../../../../../../../..]

However, I cannot match this report-bindings output by manually using --map-by and --bind-to. I had the impression that this would be the same:

mpirun -np $SLURM_NTASKS --report-bindings --map-by ppr:32:socket:PE=4 --bind-to hwthread

But this output is not quite the same:

[epsilon102:2631529] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[cor
e 1[hwt 0-1]]: [BB/BB/../../../../../../../../../../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../../../../../../../../../../..][../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../../../../../../../../../../../../../../../../../../../../../..]
[epsilon102:2631529] MCW rank 1 bound to socket 0[core 2[hwt 0-1]], socket 0[cor
e 3[hwt 0-1]]: [../../BB/BB/../../../../../../../../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../../../../../../../../../../..][../../../../../../../../../../.
./../../../../../../../../../../../../../../../../../../../../../../../../../../
../../../../../../../../../../../../../../../../../../../../../../../../../../..]

What am I missing to match the rankfile behavior? Regarding performance, what difference is there between the first and the second outputs?

Thanks for your help!
Luis
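The rankfile quoted in this thread follows a regular pattern: rank r goes to node r // 64 and to slot 2 * (r % 64). It could therefore be generated rather than typed by hand. The following is only a minimal sketch in Python, assuming the two hostnames from the message (epsilon102, epsilon103) and 64 ranks per node; the function name make_rankfile and its parameters are hypothetical, not part of Open MPI.

```python
def make_rankfile(hosts, ranks_per_node=64, stride=2):
    """Return rankfile text that places one rank every `stride` cores on each host.

    `hosts`, `ranks_per_node`, and `stride` are illustrative parameters;
    the defaults mirror the layout described in the message.
    """
    lines = []
    for rank in range(len(hosts) * ranks_per_node):
        host = hosts[rank // ranks_per_node]      # fill one node before moving to the next
        slot = (rank % ranks_per_node) * stride   # use every `stride`-th core on that node
        lines.append(f"rank {rank}={host} slot={slot}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hostnames taken from the message; adjust for your own allocation.
    print(make_rankfile(["epsilon102", "epsilon103"]), end="")
```

Redirecting the output to a file gives a 128-line rankfile in the same format as the one shown in the message, which can then be passed to mpirun in the usual way for your Open MPI version.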