That should be a two-step tango:
- Open MPI binds each MPI task to a socket
- the OpenMP runtime binds the OpenMP threads to cores (or hyperthreads)
  inside the socket assigned by Open MPI

Which compiler are you using?
Do you set any environment variables to direct the OpenMP runtime to bind
threads?
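
For example, with any OpenMP 4.0 compliant runtime (GCC 4.9+, recent Intel
or clang), something like this should implement both steps. This is an
untested sketch, so adjust it to your setup; -x is used so the variables
reach the remote nodes:

    mpirun -np 4 --map-by ppr:1:socket -bind-to socket \
           -x OMP_NUM_THREADS=10 -x OMP_PLACES=cores -x OMP_PROC_BIND=close \
           --mca plm_rsh_agent "qrsh" -report-bindings ./myid

With OMP_PLACES=cores, each OpenMP thread is pinned to one core (that is, to
both of its hwthreads), so no two threads should end up sharing a physical
core.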

Also, how do you measure which hyperthread a given OpenMP thread is on?
Is it the hyperthread in use at a given moment? If so, the thread might
migrate unless it was pinned by the OpenMP runtime.

If you are not sure, please post the source of your program so we can have
a look.
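
For reference, such a check typically looks like this minimal sketch (it
assumes glibc's sched_getcpu(); the value it returns is only a snapshot of
where the calling thread happens to run at that instant, so it is only
meaningful once the threads are pinned):

    #define _GNU_SOURCE
    #include <sched.h>   /* sched_getcpu() */
    #include <stdio.h>
    #include <omp.h>     /* compile with e.g. gcc -fopenmp */

    int main(void)
    {
        #pragma omp parallel
        {
            /* Report the CPU (hwthread) this OpenMP thread is executing
             * on right now; unpinned threads may move between calls. */
            printf("OpenMP thread %02d runs on CPU %03d\n",
                   omp_get_thread_num(), sched_getcpu());
        }
        return 0;
    }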

Last but not least, as long as the OpenMP threads are pinned to distinct
cores, you should not worry about them migrating between the hyperthreads
of the same core.

Cheers,

Gilles

On Wednesday, April 12, 2017, Heinz-Ado Arnolds <arno...@mpa-garching.mpg.de>
wrote:

> Dear rhc,
>
> to make it clearer what I am trying to achieve, I collected some examples
> for several combinations of command-line options. It would be great if you
> could find time to look at them below. The most promising one is example 4.
>
> I'd like to have 4 MPI tasks, each starting one OpenMP job with 10 threads,
> running on 2 nodes, each node having 2 sockets with 10 cores & 10 hwthreads.
> Only the 10 cores (no extra hwthreads) should be used on each socket.
>
> 4 MPI -> 1 OpenMP with 10 threads (i.e. 4x10 threads)
> 2 nodes, 2 sockets each, 10 cores & 10 hwthreads each
>
> 1. mpirun -np 4 --map-by ppr:2:node --mca plm_rsh_agent "qrsh"
> -report-bindings ./myid
>
>    Machines  :
>    pascal-2-05...DE 20
>    pascal-1-03...DE 20
>
>    [pascal-2-05:28817] MCW rank 0 bound to socket 0[core 0[hwt 0-1]],
> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket
> 0[core 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/
> BB][../../../../../../../../../..]
>    [pascal-2-05:28817] MCW rank 1 bound to socket 1[core 10[hwt 0-1]],
> socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core
> 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]],
> socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core
> 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../..
> ][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
>    [pascal-1-03:19256] MCW rank 2 bound to socket 0[core 0[hwt 0-1]],
> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket
> 0[core 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/
> BB][../../../../../../../../../..]
>    [pascal-1-03:19256] MCW rank 3 bound to socket 1[core 10[hwt 0-1]],
> socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core
> 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]],
> socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core
> 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../..
> ][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
>    MPI Instance 0001 of 0004 is on pascal-2-05, Cpus_allowed_list:
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0001(pid
> 28833), 018, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0002(pid
> 28833), 014, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0003(pid
> 28833), 028, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0004(pid
> 28833), 012, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0005(pid
> 28833), 030, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0006(pid
> 28833), 016, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0007(pid
> 28833), 038, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0008(pid
> 28833), 034, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0009(pid
> 28833), 020, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0010(pid
> 28833), 022, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0002 of 0004 is on pascal-2-05, Cpus_allowed_list:
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0001(pid
> 28834), 007, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0002(pid
> 28834), 037, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0003(pid
> 28834), 039, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0004(pid
> 28834), 035, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0005(pid
> 28834), 031, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0006(pid
> 28834), 005, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0007(pid
> 28834), 027, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0008(pid
> 28834), 017, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0009(pid
> 28834), 019, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-2-05: MP thread  #0010(pid
> 28834), 029, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0003 of 0004 is on pascal-1-03, Cpus_allowed_list:
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0001(pid
> 19269), 012, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0002(pid
> 19269), 034, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0003(pid
> 19269), 008, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0004(pid
> 19269), 038, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0005(pid
> 19269), 032, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0006(pid
> 19269), 036, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0007(pid
> 19269), 020, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0008(pid
> 19269), 002, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0009(pid
> 19269), 004, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0010(pid
> 19269), 006, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0004 of 0004 is on pascal-1-03, Cpus_allowed_list:
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0001(pid
> 19268), 005, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0002(pid
> 19268), 029, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0003(pid
> 19268), 015, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0004(pid
> 19268), 007, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0005(pid
> 19268), 031, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0006(pid
> 19268), 013, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0007(pid
> 19268), 037, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0008(pid
> 19268), 039, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0009(pid
> 19268), 021, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0010(pid
> 19268), 023, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>
>    I get a distribution to 4 sockets on 2 nodes as expected, but cores and
> their corresponding hwthreads are used simultaneously:
>    MPI Instance 0001 of 0004: MP thread  #0001 runs on CPU 018, MP thread
> #0007 runs on CPU 038,
>                               MP thread  #0002 runs on CPU 014, MP thread
> #0008 runs on CPU 034
>    According to "lscpu -a -e", CPUs 18/38 and likewise 14/34 are hwthreads
> of the same physical core.
>
> 2. mpirun -np 4 --map-by ppr:2:node --use-hwthread-cpus -bind-to hwthread
> --mca plm_rsh_agent "qrsh" -report-bindings ./myid
>
>    Machines  :
>    pascal-1-05...DE 20
>    pascal-2-05...DE 20
>
>    I get this warning:
>
>      WARNING: a request was made to bind a process. While the system
>      supports binding the process itself, at least one node does NOT
>      support binding memory to the process location.
>
>        Node:  pascal-1-05
>
>      Open MPI uses the "hwloc" library to perform process and memory
>      binding. This error message means that hwloc has indicated that
>      processor binding support is not available on this machine.
>
>      On OS X, processor and memory binding is not available at all (i.e.,
>      the OS does not expose this functionality).
>
>      On Linux, lack of the functionality can mean that you are on a
>      platform where processor and memory affinity is not supported in Linux
>      itself, or that hwloc was built without NUMA and/or processor affinity
>      support. When building hwloc (which, depending on your Open MPI
>      installation, may be embedded in Open MPI itself), it is important to
>      have the libnuma header and library files available. Different linux
>      distributions package these files under different names; look for
>      packages with the word "numa" in them. You may also need a developer
>      version of the package (e.g., with "dev" or "devel" in the name) to
>      obtain the relevant header files.
>
>      If you are getting this message on a non-OS X, non-Linux platform,
>      then hwloc does not support processor / memory affinity on this
>      platform. If the OS/platform does actually support processor / memory
>      affinity, then you should contact the hwloc maintainers:
>      https://github.com/open-mpi/hwloc.
>
>      This is a warning only; your job will continue, though performance may
>      be degraded.
>
>    and these results:
>
>    [pascal-1-05:33175] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> [B./../../../../../../../../..][../../../../../../../../../..]
>    [pascal-1-05:33175] MCW rank 1 bound to socket 0[core 0[hwt 1]]:
> [.B/../../../../../../../../..][../../../../../../../../../..]
>    [pascal-2-05:28916] MCW rank 2 bound to socket 0[core 0[hwt 0]]:
> [B./../../../../../../../../..][../../../../../../../../../..]
>    [pascal-2-05:28916] MCW rank 3 bound to socket 0[core 0[hwt 1]]:
> [.B/../../../../../../../../..][../../../../../../../../../..]
>    MPI Instance 0001 of 0004 is on pascal-1-05, Cpus_allowed_list:      0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0001(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0002(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0003(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0004(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0005(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0006(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0007(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0008(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0009(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-05: MP thread  #0010(pid
> 33193), 000, Cpus_allowed_list:    0
>    MPI Instance 0002 of 0004 is on pascal-1-05, Cpus_allowed_list:      20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0001(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0002(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0003(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0004(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0005(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0006(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0007(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0008(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0009(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0002 of 0004 is on pascal-1-05: MP thread  #0010(pid
> 33192), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-2-05, Cpus_allowed_list:      0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0001(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0002(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0003(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0004(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0005(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0006(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0007(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0008(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0009(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0003 of 0004 is on pascal-2-05: MP thread  #0010(pid
> 28930), 000, Cpus_allowed_list:    0
>    MPI Instance 0004 of 0004 is on pascal-2-05, Cpus_allowed_list:      20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0001(pid
> 28929), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0002(pid
> 28929), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0003(pid
> 28929), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0004(pid
> 28929), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0005(pid
> 28929), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0006(pid
> 28929), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0007(pid
> 28929), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0008(pid
> 28929), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0009(pid
> 28929), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-2-05: MP thread  #0010(pid
> 28929), 020, Cpus_allowed_list:    20
>
>    Only 2 CPUs are used per node, and they are the two hwthreads of the
> same physical core.
>
> 3. mpirun -np 4 --use-hwthread-cpus -bind-to hwthread --mca plm_rsh_agent
> "qrsh" -report-bindings ./myid
>
>    Machines  :
>    pascal-1-03...DE 20
>    pascal-2-02...DE 20
>
>    I get a warning again:
>
>      WARNING: a request was made to bind a process. While the system
>      supports binding the process itself, at least one node does NOT
>      support binding memory to the process location.
>
>        Node:  pascal-1-03
>
>      Open MPI uses the "hwloc" library to perform process and memory
>      binding. This error message means that hwloc has indicated that
>      processor binding support is not available on this machine.
>
>      On OS X, processor and memory binding is not available at all (i.e.,
>      the OS does not expose this functionality).
>
>      On Linux, lack of the functionality can mean that you are on a
>      platform where processor and memory affinity is not supported in Linux
>      itself, or that hwloc was built without NUMA and/or processor affinity
>      support. When building hwloc (which, depending on your Open MPI
>      installation, may be embedded in Open MPI itself), it is important to
>      have the libnuma header and library files available. Different linux
>      distributions package these files under different names; look for
>      packages with the word "numa" in them. You may also need a developer
>      version of the package (e.g., with "dev" or "devel" in the name) to
>      obtain the relevant header files.
>
>      If you are getting this message on a non-OS X, non-Linux platform,
>      then hwloc does not support processor / memory affinity on this
>      platform. If the OS/platform does actually support processor / memory
>      affinity, then you should contact the hwloc maintainers:
>      https://github.com/open-mpi/hwloc.
>
>      This is a warning only; your job will continue, though performance may
>      be degraded.
>
>    and these results:
>
>    [pascal-1-03:19345] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> [B./../../../../../../../../..][../../../../../../../../../..]
>    [pascal-1-03:19345] MCW rank 1 bound to socket 1[core 10[hwt 0]]:
> [../../../../../../../../../..][B./../../../../../../../../..]
>    [pascal-1-03:19345] MCW rank 2 bound to socket 0[core 0[hwt 1]]:
> [.B/../../../../../../../../..][../../../../../../../../../..]
>    [pascal-1-03:19345] MCW rank 3 bound to socket 1[core 10[hwt 1]]:
> [../../../../../../../../../..][.B/../../../../../../../../..]
>    MPI Instance 0001 of 0004 is on pascal-1-03, Cpus_allowed_list:      0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0001(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0002(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0003(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0004(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0005(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0006(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0007(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0008(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0009(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0001 of 0004 is on pascal-1-03: MP thread  #0010(pid
> 19373), 000, Cpus_allowed_list:    0
>    MPI Instance 0002 of 0004 is on pascal-1-03, Cpus_allowed_list:      1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0001(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0002(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0003(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0004(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0005(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0006(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0007(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0008(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0009(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0002 of 0004 is on pascal-1-03: MP thread  #0010(pid
> 19372), 001, Cpus_allowed_list:    1
>    MPI Instance 0003 of 0004 is on pascal-1-03, Cpus_allowed_list:      20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0001(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0002(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0003(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0004(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0005(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0006(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0007(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0008(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0009(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0003 of 0004 is on pascal-1-03: MP thread  #0010(pid
> 19370), 020, Cpus_allowed_list:    20
>    MPI Instance 0004 of 0004 is on pascal-1-03, Cpus_allowed_list:      21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0001(pid
> 19371), 021, Cpus_allowed_list:    21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0002(pid
> 19371), 021, Cpus_allowed_list:    21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0003(pid
> 19371), 021, Cpus_allowed_list:    21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0004(pid
> 19371), 021, Cpus_allowed_list:    21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0005(pid
> 19371), 021, Cpus_allowed_list:    21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0006(pid
> 19371), 021, Cpus_allowed_list:    21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0007(pid
> 19371), 021, Cpus_allowed_list:    21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0008(pid
> 19371), 021, Cpus_allowed_list:    21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0009(pid
> 19371), 021, Cpus_allowed_list:    21
>    MPI Instance 0004 of 0004 is on pascal-1-03: MP thread  #0010(pid
> 19371), 021, Cpus_allowed_list:    21
>
>    The jobs are scheduled to one machine only.
>
> 4. mpirun -np 4 --map-by ppr:2:node --use-hwthread-cpus --mca
> plm_rsh_agent "qrsh" -report-bindings ./myid
>
>    Machines  :
>    pascal-1-00...DE 20
>    pascal-3-00...DE 20
>
>    [pascal-1-00:05867] MCW rank 0 bound to socket 0[core 0[hwt 0-1]],
> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket
> 0[core 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/
> BB][../../../../../../../../../..]
>    [pascal-1-00:05867] MCW rank 1 bound to socket 1[core 10[hwt 0-1]],
> socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core
> 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]],
> socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core
> 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../..
> ][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
>    [pascal-3-00:07501] MCW rank 2 bound to socket 0[core 0[hwt 0-1]],
> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket
> 0[core 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/
> BB][../../../../../../../../../..]
>    [pascal-3-00:07501] MCW rank 3 bound to socket 1[core 10[hwt 0-1]],
> socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core
> 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]],
> socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core
> 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../..
> ][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
>    MPI Instance 0001 of 0004 is on pascal-1-00, Cpus_allowed_list:
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0001(pid
> 05884), 034, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0002(pid
> 05884), 038, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0003(pid
> 05884), 002, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0004(pid
> 05884), 008, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0005(pid
> 05884), 036, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0006(pid
> 05884), 000, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0007(pid
> 05884), 004, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0008(pid
> 05884), 006, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0009(pid
> 05884), 030, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0010(pid
> 05884), 032, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0002 of 0004 is on pascal-1-00, Cpus_allowed_list:
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0001(pid
> 05883), 031, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0002(pid
> 05883), 017, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0003(pid
> 05883), 027, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0004(pid
> 05883), 039, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0005(pid
> 05883), 011, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0006(pid
> 05883), 033, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0007(pid
> 05883), 015, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0008(pid
> 05883), 021, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0009(pid
> 05883), 003, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0010(pid
> 05883), 025, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0003 of 0004 is on pascal-3-00, Cpus_allowed_list:
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0001(pid
> 07513), 016, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0002(pid
> 07513), 020, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0003(pid
> 07513), 022, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0004(pid
> 07513), 018, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0005(pid
> 07513), 012, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0006(pid
> 07513), 004, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0007(pid
> 07513), 008, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0008(pid
> 07513), 006, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0009(pid
> 07513), 030, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0010(pid
> 07513), 034, Cpus_allowed_list:    0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>    MPI Instance 0004 of 0004 is on pascal-3-00, Cpus_allowed_list:
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0001(pid
> 07514), 017, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0002(pid
> 07514), 025, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0003(pid
> 07514), 029, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0004(pid
> 07514), 003, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0005(pid
> 07514), 033, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0006(pid
> 07514), 001, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0007(pid
> 07514), 007, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0008(pid
> 07514), 039, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0009(pid
> 07514), 035, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>    MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0010(pid
> 07514), 031, Cpus_allowed_list:    1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
>
>    This distribution looks very good with this combination of options
> "--map-by ppr:2:node --use-hwthread-cpus", with one exception: looking at
> "MPI Instance 0002", you'll find that "MP thread  #0001" is executed on CPU
> 031, and "MP thread  #0005" is executed on CPU 011. 011/031 are hwthreads
> of the same physical core.
>    All the others are perfect! Is this error my fault, or might there be a
> small remaining binding problem in Open MPI?
>
> I'd very much appreciate any hints!
>
> Kind regards,
>
> Ado
>
> On 11.04.2017 01:36, r...@open-mpi.org wrote:
> > I’m not entirely sure I understand your reference to “real cores”. When
> we bind you to a core, we bind you to all the HTs that comprise that core.
> So, yes, with HT enabled, the binding report will list things by HT, but
> you’ll always be bound to the full core if you tell us bind-to core.
> >
> > The default binding directive is bind-to socket when more than 2
> processes are in the job, and that’s what you are showing. You can override
> that by adding "-bind-to core" to your cmd line if that is what you desire.
> >
> > If you want to use individual HTs as independent processors, then
> “--use-hwthread-cpus -bind-to hwthreads” would indeed be the right
> combination.
> >
> >> On Apr 10, 2017, at 3:55 AM, Heinz-Ado Arnolds
> <arno...@mpa-garching.mpg.de> wrote:
> >>
> >> Dear OpenMPI users & developers,
> >>
> >> I'm trying to distribute my jobs (with SGE) to a machine with a certain
> number of nodes, each node having 2 sockets, each socket having 10 cores &
> 10 hyperthreads. I'd like to use only the real cores, no hyperthreading.
> >>
> >> lscpu -a -e
> >>
> >> CPU NODE SOCKET CORE L1d:L1i:L2:L3
> >> 0   0    0      0    0:0:0:0
> >> 1   1    1      1    1:1:1:1
> >> 2   0    0      2    2:2:2:0
> >> 3   1    1      3    3:3:3:1
> >> 4   0    0      4    4:4:4:0
> >> 5   1    1      5    5:5:5:1
> >> 6   0    0      6    6:6:6:0
> >> 7   1    1      7    7:7:7:1
> >> 8   0    0      8    8:8:8:0
> >> 9   1    1      9    9:9:9:1
> >> 10  0    0      10   10:10:10:0
> >> 11  1    1      11   11:11:11:1
> >> 12  0    0      12   12:12:12:0
> >> 13  1    1      13   13:13:13:1
> >> 14  0    0      14   14:14:14:0
> >> 15  1    1      15   15:15:15:1
> >> 16  0    0      16   16:16:16:0
> >> 17  1    1      17   17:17:17:1
> >> 18  0    0      18   18:18:18:0
> >> 19  1    1      19   19:19:19:1
> >> 20  0    0      0    0:0:0:0
> >> 21  1    1      1    1:1:1:1
> >> 22  0    0      2    2:2:2:0
> >> 23  1    1      3    3:3:3:1
> >> 24  0    0      4    4:4:4:0
> >> 25  1    1      5    5:5:5:1
> >> 26  0    0      6    6:6:6:0
> >> 27  1    1      7    7:7:7:1
> >> 28  0    0      8    8:8:8:0
> >> 29  1    1      9    9:9:9:1
> >> 30  0    0      10   10:10:10:0
> >> 31  1    1      11   11:11:11:1
> >> 32  0    0      12   12:12:12:0
> >> 33  1    1      13   13:13:13:1
> >> 34  0    0      14   14:14:14:0
> >> 35  1    1      15   15:15:15:1
> >> 36  0    0      16   16:16:16:0
> >> 37  1    1      17   17:17:17:1
> >> 38  0    0      18   18:18:18:0
> >> 39  1    1      19   19:19:19:1
> >>
> >> How should I choose the options & parameters of mpirun to achieve
> this behavior?
> >>
> >> mpirun -np 4 --map-by ppr:2:node --mca plm_rsh_agent "qrsh"
> -report-bindings ./myid
> >>
> >> distributes to
> >>
> >> [pascal-1-04:35735] MCW rank 0 bound to socket 0[core 0[hwt 0-1]],
> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket
> 0[core 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/
> BB][../../../../../../../../../..]
> >> [pascal-1-04:35735] MCW rank 1 bound to socket 1[core 10[hwt 0-1]],
> socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core
> 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]],
> socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core
> 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../..
> ][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
> >> [pascal-1-03:00787] MCW rank 2 bound to socket 0[core 0[hwt 0-1]],
> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket
> 0[core 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/
> BB][../../../../../../../../../..]
> >> [pascal-1-03:00787] MCW rank 3 bound to socket 1[core 10[hwt 0-1]],
> socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core
> 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]],
> socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core
> 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../..
> ][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
> >> MPI Instance 0001 of 0004 is on pascal-1-04,pascal-1-04.MPA-
> Garching.MPG.DE, Cpus_allowed_list:      0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
> >> MPI Instance 0002 of 0004 is on pascal-1-04,pascal-1-04.MPA-
> Garching.MPG.DE, Cpus_allowed_list:      1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
> >> MPI Instance 0003 of 0004 is on pascal-1-03,pascal-1-03.MPA-
> Garching.MPG.DE, Cpus_allowed_list:      0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
> >> MPI Instance 0004 of 0004 is on pascal-1-03,pascal-1-03.MPA-
> Garching.MPG.DE, Cpus_allowed_list:      1,3,5,7,9,11,13,15,17,19,21,
> 23,25,27,29,31,33,35,37,39
> >>
> >> i.e. 2 nodes: ok, 2 sockets: ok, different sets of cores: ok, but all
> hwthreads are used
> >>
> >> I have tried several combinations of --use-hwthread-cpus and --bind-to
> hwthreads, but didn't find the right combination.
> >>
> >> It would be great to get some hints!
> >>
> >> Thanks a lot in advance,
> >>
> >> Heinz-Ado Arnolds
>
> --
> ________________________________________________________________________
>
> Dipl.-Ing. Heinz-Ado Arnolds
>
> Max-Planck-Institut für Astrophysik
> Karl-Schwarzschild-Strasse 1
> D-85748 Garching
>
> Postfach 1317
> D-85741 Garching
>
> Phone   +49 89 30000-2217
> FAX     +49 89 30000-3240
> email   arnolds[at]MPA-Garching.MPG.DE
> ________________________________________________________________________
>
>
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
