FWIW: I’ll be presenting “Mapping, Ranking, and Binding - Oh My!” at the OMPI BoF meeting at SC’16, for those who can attend.
> On Oct 11, 2016, at 8:16 AM, Dave Love <d.l...@liverpool.ac.uk> wrote:
>
> Wirawan Purwanto <wiraw...@gmail.com> writes:
>
>> Instead of the scenario above, I was trying to get the MPI processes
>> side-by-side (more like "fill_up" policy in SGE scheduler), i.e. fill
>> node 0 first, then fill node 1, and so on. How do I do this properly?
>>
>> I tried a few attempts that fail:
>>
>> $ export OMP_NUM_THREADS=2
>> $ mpirun -np 16 -map-by core:PE=2 ./EXECUTABLE
>
> ...
>
>> Clearly I am not understanding how this map-by works. Could somebody
>> help me? There was a wiki article partially written:
>>
>> https://github.com/open-mpi/ompi/wiki/ProcessPlacement
>>
>> but unfortunately it is also not clear to me.
>
> Me neither; this stuff has traditionally been quite unclear and really
> needs documenting/explaining properly.
>
> This sort of thing from my local instructions for OMPI 1.8 probably does
> what you want for OMP_NUM_THREADS=2 (where the qrsh options just get me
> a couple of small nodes):
>
> $ qrsh -pe mpi 24 -l num_proc=12 \
>     mpirun -n 12 --map-by slot:PE=2 --bind-to core --report-bindings true |&
>     sort -k 4 -n
> [comp544:03093] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
> [comp544:03093] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
> [comp544:03093] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B][./././././.]
> [comp544:03093] MCW rank 3 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
> [comp544:03093] MCW rank 4 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././.][././B/B/./.]
> [comp544:03093] MCW rank 5 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][././././B/B]
> [comp527:03056] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
> [comp527:03056] MCW rank 7 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
> [comp527:03056] MCW rank 8 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B][./././././.]
> [comp527:03056] MCW rank 9 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
> [comp527:03056] MCW rank 10 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././.][././B/B/./.]
> [comp527:03056] MCW rank 11 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][././././B/B]
>
> I don't remember how I found that out.
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
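
In case it helps when checking whether a layout really is "fill node 0 first, then node 1", here is a minimal Python sketch that reads `--report-bindings` lines of the form quoted above and lists the cores each rank is bound to. It assumes the line format is exactly what the OMPI 1.8 output in this thread shows (other versions may print something different), and the names `parse_binding` and `ranks_by_host` are made up for illustration; they are not part of Open MPI.

```python
import re

# Matches one report line in the form quoted in this thread, e.g.
# [comp544:03093] MCW rank 3 bound to ...: [./././././.][B/B/./././.]
# Assumption: the trailing mask has one [...] group per socket.
LINE_RE = re.compile(
    r"\[(?P<host>[^:\]]+):\d+\] MCW rank (?P<rank>\d+) bound to .*: "
    r"(?P<mask>(?:\[[B./]+\])+)\s*$"
)


def parse_binding(line):
    """Return (host, rank, bound_cores) for one report line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    cores = []
    index = 0  # core number, counted across sockets from left to right
    for socket in re.findall(r"\[([B./]+)\]", m.group("mask")):
        # '/' separates cores within a socket; a 'B' marks a bound hardware thread.
        for core in socket.split("/"):
            if "B" in core:
                cores.append(index)
            index += 1
    return m.group("host"), int(m.group("rank")), cores


def ranks_by_host(lines):
    """Group rank numbers by host, keeping hosts in first-seen order."""
    hosts = {}
    for line in lines:
        parsed = parse_binding(line)
        if parsed is not None:
            host, rank, _ = parsed
            hosts.setdefault(host, []).append(rank)
    return hosts
```

Fed the rank 3 line from the quoted output, `parse_binding` gives `("comp544", 3, [6, 7])`, which agrees with the "core 6 ... core 7" text on that line. With all twelve lines, `ranks_by_host` would put ranks 0 to 5 on comp544 and 6 to 11 on comp527, i.e. the first node is filled before the second.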