Sure - use the 1.7 branch or the developer's trunk. We have the --bind-to numa 
option there.
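
If you have to stay on 1.6.x for now, the rankfile can at least be generated instead of written by hand. The following is only a rough sketch, not something shipped with Open MPI: the function name numa_rankfile is made up for illustration, and it assumes that each NUMA node's CPU IDs are contiguous and listed in ascending order, as they are in the numactl output quoted below. It emits one rank per NUMA node, using the same "rank N=host slot=a-b" syntax as the rankfile in the original mail.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): read lines of the form
#   node 3 cpus: 18 19 20 21 22 23
# on stdin and print one rankfile entry per NUMA node.
# Assumption: the CPU IDs of each node are contiguous and ascending,
# so the first and last ID on the line delimit the slot range.
numa_rankfile() {
    host="${1:-localhost}"   # host name to write into the rankfile
    awk -v host="$host" '
        $1 == "node" && $3 == "cpus:" && NF >= 4 {
            # $4 is the first CPU ID of the node, $NF the last one
            printf "rank %d=%s slot=%s-%s\n", rank++, host, $4, $NF
        }'
}

# Intended usage on the compute node (left commented out here):
#   numactl --hardware | grep cpus | numa_rankfile > rankfile
#   mpirun --report-bindings -np 8 --rankfile rankfile sleep 1s
```

Nodes that numactl reports without any CPUs are skipped, so the rank numbers stay consecutive. Check the result with --report-bindings before relying on it.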


On Feb 14, 2013, at 8:54 AM, Oliver Weihe <we...@deltacomputer.de> wrote:

> Hi, 
> 
> is it possible to bind MPI processes to a NUMA node on Opteron 6xxx series 
> CPUs (e.g. --bind-to-NUMAnode) *without* using a rankfile? Opteron 6xxx CPUs 
> have two NUMA nodes per socket, so --bind-to-socket doesn't work the way I 
> want. 
> 
> This is a 4-socket Opteron 6344 system (12 cores per socket): 
> 
> root@node01:~> numactl --hardware | grep cpus 
> node 0 cpus: 0 1 2 3 4 5 
> node 1 cpus: 6 7 8 9 10 11 
> node 2 cpus: 12 13 14 15 16 17 
> node 3 cpus: 18 19 20 21 22 23 
> node 4 cpus: 24 25 26 27 28 29 
> node 5 cpus: 30 31 32 33 34 35 
> node 6 cpus: 36 37 38 39 40 41 
> node 7 cpus: 42 43 44 45 46 47 
> 
> root@node01:~> /opt/openmpi/1.6.3/gcc/bin/mpirun --report-bindings -np 8 --bind-to-socket --bysocket sleep 1s 
> [node01.cluster:21446] MCW rank 1 bound to socket 1[core 0-11]: [. . . . . . . . . . . .][B B B B B B B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .] 
> [node01.cluster:21446] MCW rank 2 bound to socket 2[core 0-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B B B B B B B][. . . . . . . . . . . .] 
> [node01.cluster:21446] MCW rank 3 bound to socket 3[core 0-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B B B B B B B] 
> [node01.cluster:21446] MCW rank 4 bound to socket 0[core 0-11]: [B B B B B B B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .] 
> [node01.cluster:21446] MCW rank 5 bound to socket 1[core 0-11]: [. . . . . . . . . . . .][B B B B B B B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .] 
> [node01.cluster:21446] MCW rank 6 bound to socket 2[core 0-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B B B B B B B][. . . . . . . . . . . .] 
> [node01.cluster:21446] MCW rank 7 bound to socket 3[core 0-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B B B B B B B] 
> [node01.cluster:21446] MCW rank 0 bound to socket 0[core 0-11]: [B B B B B B B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .] 
> 
> So each process is bound to *two* NUMA nodes, but I want to bind to *one* 
> NUMA node. 
> 
> What I want is more like this: 
> root@node01:~> cat rankfile 
> rank 0=localhost slot=0-5 
> rank 1=localhost slot=6-11 
> rank 2=localhost slot=12-17 
> rank 3=localhost slot=18-23 
> rank 4=localhost slot=24-29 
> rank 5=localhost slot=30-35 
> rank 6=localhost slot=36-41 
> rank 7=localhost slot=42-47 
> root@node01:~> /opt/openmpi/1.6.3/gcc/bin/mpirun --report-bindings -np 8 --rankfile rankfile sleep 1s 
> [node01.cluster:21505] MCW rank 1 bound to socket 0[core 6-11]: [. . . . . . B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .] (slot list 6-11) 
> [node01.cluster:21505] MCW rank 2 bound to socket 1[core 0-5]: [. . . . . . . . . . . .][B B B B B B . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .] (slot list 12-17) 
> [node01.cluster:21505] MCW rank 3 bound to socket 1[core 6-11]: [. . . . . . . . . . . .][. . . . . . B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .] (slot list 18-23) 
> [node01.cluster:21505] MCW rank 4 bound to socket 2[core 0-5]: [. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B . . . . . .][. . . . . . . . . . . .] (slot list 24-29) 
> [node01.cluster:21505] MCW rank 5 bound to socket 2[core 6-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . B B B B B B][. . . . . . . . . . . .] (slot list 30-35) 
> [node01.cluster:21505] MCW rank 6 bound to socket 3[core 0-5]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B . . . . . .] (slot list 36-41) 
> [node01.cluster:21505] MCW rank 7 bound to socket 3[core 6-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . B B B B B B] (slot list 42-47) 
> [node01.cluster:21505] MCW rank 0 bound to socket 0[core 0-5]: [B B B B B B . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .] (slot list 0-5) 
> 
> 
> Actually I'm dreaming of 
> mpirun --bind-to-NUMAnode --bycore ... 
> or 
> mpirun --bind-to-NUMAnode --byNUMAnode ... 
> 
> Is there any workaround except rankfiles for this? 
> 
> Regards, 
>  Oliver Weihe