FWIW: I’ll be presenting “Mapping, Ranking, and Binding - Oh My!” at the OMPI
BoF meeting at SC’16, for those who can attend.


> On Oct 11, 2016, at 8:16 AM, Dave Love <d.l...@liverpool.ac.uk> wrote:
> 
> Wirawan Purwanto <wiraw...@gmail.com> writes:
> 
>> Instead of the scenario above, I was trying to get the MPI processes
>> side-by-side (more like "fill_up" policy in SGE scheduler), i.e. fill
>> node 0 first, then fill node 1, and so on. How do I do this properly?
>> 
>> I tried a few attempts that fail:
>> 
>> $ export OMP_NUM_THREADS=2
>> $ mpirun -np 16 -map-by core:PE=2 ./EXECUTABLE
> 
> ...
> 
>> Clearly I am not understanding how this map-by works. Could somebody
>> help me? There was a wiki article partially written:
>> 
>> https://github.com/open-mpi/ompi/wiki/ProcessPlacement
>> 
>> but unfortunately it is also not clear to me.
> 
> Me neither; this stuff has traditionally been quite unclear and really
> needs documenting/explaining properly.
> 
> This sort of thing from my local instructions for OMPI 1.8 probably does
> what you want for OMP_NUM_THREADS=2 (where the qrsh options just get me
> a couple of small nodes):
> 
>  $ qrsh -pe mpi 24 -l num_proc=12 \
>     mpirun -n 12 --map-by slot:PE=2 --bind-to core --report-bindings true |&
>     sort -k 4 -n
>  [comp544:03093] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
>  [comp544:03093] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
>  [comp544:03093] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B][./././././.]
>  [comp544:03093] MCW rank 3 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
>  [comp544:03093] MCW rank 4 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././.][././B/B/./.]
>  [comp544:03093] MCW rank 5 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][././././B/B]
>  [comp527:03056] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
>  [comp527:03056] MCW rank 7 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
>  [comp527:03056] MCW rank 8 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B][./././././.]
>  [comp527:03056] MCW rank 9 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
>  [comp527:03056] MCW rank 10 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././.][././B/B/./.]
>  [comp527:03056] MCW rank 11 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][././././B/B]
> 
> I don't remember how I found that out.
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
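
For what it's worth, the arithmetic behind the quoted command can be sketched as below. This is only an illustration, not an official recipe: the variable names are made up for the example, the node shape (2 nodes, 12 cores each) is read off the binding maps quoted above, and ./EXECUTABLE is the placeholder from the original question. The script only prints the mpirun line rather than running it, so it can be tried without an MPI installation.

```shell
#!/bin/sh
# Assumed values, taken from the example quoted in this thread.
cores_per_node=12     # two 6-core sockets, as in the binding maps
nodes=2               # comp544 and comp527
threads_per_rank=2    # matches OMP_NUM_THREADS=2

# With PE=2 each rank is bound to 2 cores, so a node holds
# cores_per_node / threads_per_rank ranks before the next node is used.
ranks_per_node=$((cores_per_node / threads_per_rank))
total_ranks=$((ranks_per_node * nodes))

export OMP_NUM_THREADS=$threads_per_rank

# Build the command line using the same options as the quoted example.
cmd="mpirun -n $total_ranks --map-by slot:PE=$threads_per_rank --bind-to core --report-bindings ./EXECUTABLE"
echo "$cmd"
```

With the assumed values this gives 6 ranks per node and 12 ranks in total, which matches ranks 0-5 landing on the first node and ranks 6-11 on the second in the quoted output.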

