Not good: --bind-to-core fails for me. Here is the run:
$ /labhome/alexm/workspace/openmpi-1.6.1a1hge06c2f2a0859/inst/bin/mpirun \
    --host h-qa-017,h-qa-017,h-qa-017,h-qa-017,h-qa-018,h-qa-018,h-qa-018,h-qa-018 \
    -np 8 --bind-to-core -bynode -display-map \
    /usr/mpi/gcc/mlnx-openmpi-1.6rc4/tests/osu_benchmarks-3.1.1/osu_alltoall
======================== JOB MAP ========================
Data for node: h-qa-017    Num procs: 4
    Process OMPI jobid: [36855,1] Process rank: 0
    Process OMPI jobid: [36855,1] Process rank: 2
    Process OMPI jobid: [36855,1] Process rank: 4
    Process OMPI jobid: [36855,1] Process rank: 6
Data for node: h-qa-018    Num procs: 4
    Process OMPI jobid: [36855,1] Process rank: 1
    Process OMPI jobid: [36855,1] Process rank: 3
    Process OMPI jobid: [36855,1] Process rank: 5
    Process OMPI jobid: [36855,1] Process rank: 7
=============================================================
--------------------------------------------------------------------------
An invalid physical processor ID was returned when attempting to bind
an MPI process to a unique processor.
This usually means that you requested binding to more processors than
exist (e.g., trying to bind N MPI processes to M processors, where N >
M). Double check that you have enough unique processors for all the
MPI processes that you are launching on this host.
hwloc shows only two cores per node here (four PUs, with hyperthreading
enabled), so there are more ranks per node than cores to bind them to:

$ hwloc-ls --of console
Machine (32GB)
  NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (20MB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
    PU L#0 (P#0)
    PU L#1 (P#2)
  NUMANode L#1 (P#1 16GB) + Socket L#1 + L3 L#1 (20MB) + L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1
    PU L#2 (P#1)
    PU L#3 (P#3)
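
For reference, here is a small hwloc-based sketch (my own illustration, not
the Open MPI binding code) of the same situation: it loads the local topology
and tries to pick one distinct core per local rank, with ranks_per_node = 4
taken from the job map above. On a 2-core node the lookups for local ranks 2
and 3 come back empty, which is the N > M case the error message describes.

/* Sketch only: assumes hwloc is installed; compile with "gcc check_cores.c -lhwloc". */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    int ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    int npus   = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU);
    printf("cores=%d PUs=%d\n", ncores, npus);    /* on these nodes: cores=2 PUs=4 */

    int ranks_per_node = 4;                       /* taken from the job map above */
    for (int lrank = 0; lrank < ranks_per_node; lrank++) {
        /* One distinct core per local rank, roughly what --bind-to-core wants on each node. */
        hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, lrank);
        if (core == NULL)
            printf("local rank %d: no core left to bind to\n", lrank);
        else
            printf("local rank %d: core L#%u (P#%u)\n",
                   lrank, core->logical_index, core->os_index);
    }

    hwloc_topology_destroy(topo);
    return 0;
}

So only two of the four ranks per node can get a core of their own; the other
two would have to share or land on hyperthreads.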
On Tue, May 29, 2012 at 11:00 PM, Jeff Squyres <[email protected]> wrote:
> Per ticket #3108, there were still some unfortunate bugs in the affinity
> code in 1.6. :-(
>
> These have now been fixed. ...but since this is the 2nd or 3rd time we have
> "fixed" the 1.5/1.6 series w.r.t. processor affinity, I'd really like
> people to test this stuff before it's committed and we ship 1.6.1. I've
> put tarballs containing the fixes here:
>
> http://www.open-mpi.org/~jsquyres/unofficial/
>
> Can you please try mpirun options like --bind-to-core and --bind-to-socket
> and ensure that they still work for you? (even on machines with
> hyperthreading enabled, if you have access to such things)
>
> IBM: I'd particularly like to hear that we haven't made anything worse on
> POWER systems. Thanks.
>
> --
> Jeff Squyres
> [email protected]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/