Not good:
/labhome/alexm/workspace/openmpi-1.6.1a1hge06c2f2a0859/inst/bin/mpirun --host h-qa-017,h-qa-017,h-qa-017,h-qa-017,h-qa-018,h-qa-018,h-qa-018,h-qa-018 -np 8 --bind-to-core -bynode -display-map /usr/mpi/gcc/mlnx-openmpi-1.6rc4/tests/osu_benchmarks-3.1.1/osu_alltoall

 ========================   JOB MAP   ========================

 Data for node: h-qa-017       Num procs: 4
        Process OMPI jobid: [36855,1] Process rank: 0
        Process OMPI jobid: [36855,1] Process rank: 2
        Process OMPI jobid: [36855,1] Process rank: 4
        Process OMPI jobid: [36855,1] Process rank: 6

 Data for node: h-qa-018       Num procs: 4
        Process OMPI jobid: [36855,1] Process rank: 1
        Process OMPI jobid: [36855,1] Process rank: 3
        Process OMPI jobid: [36855,1] Process rank: 5
        Process OMPI jobid: [36855,1] Process rank: 7

 =============================================================
--------------------------------------------------------------------------
An invalid physical processor ID was returned when attempting to bind
an MPI process to a unique processor.

This usually means that you requested binding to more processors than
exist (e.g., trying to bind N MPI processes to M processors, where
N > M).  Double check that you have enough unique processors for all
the MPI processes that you are launching on this host.

$ hwloc-ls --of console
Machine (32GB)
  NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (20MB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
    PU L#0 (P#0)
    PU L#1 (P#2)
  NUMANode L#1 (P#1 16GB) + Socket L#1 + L3 L#1 (20MB) + L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1
    PU L#2 (P#1)
    PU L#3 (P#3)

On Tue, May 29, 2012 at 11:00 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> Per ticket #3108, there were still some unfortunate bugs in the affinity
> code in 1.6.  :-(
>
> These have now been fixed.  ...but since this is the 2nd or 3rd time we
> have "fixed" the 1.5/1.6 series w.r.t. processor affinity, I'd really
> like people to test this stuff before it's committed and we ship 1.6.1.
> I've put tarballs containing the fixes here:
>
>     http://www.open-mpi.org/~jsquyres/unofficial/
>
> Can you please try mpirun options like --bind-to-core and --bind-to-socket
> and ensure that they still work for you?  (even on machines with
> hyperthreading enabled, if you have access to such things)
>
> IBM: I'd particularly like to hear that we haven't made anything worse on
> POWER systems.  Thanks.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
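In case it helps narrow this down, below is a minimal sketch (not part of the original report) that uses the public hwloc C API to print the core and PU counts a node exposes, so they can be compared against the 4 ranks per node that --bind-to-core is trying to place above. The file name check_topo.c and the hard-coded ranks_per_node value are my own assumptions for illustration, not anything produced by Open MPI.

/* check_topo.c -- count cores and hardware threads (PUs) with hwloc and
 * compare against a per-node rank count (hypothetical value of 4 below).
 * Build, assuming hwloc headers/libs are installed:
 *   gcc check_topo.c -lhwloc -o check_topo
 */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;
    int ncores, npus;
    const int ranks_per_node = 4;   /* matches the -np 8 / 2-node run above */

    hwloc_topology_init(&topo);     /* allocate a topology object */
    hwloc_topology_load(topo);      /* discover the current machine */

    ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    npus   = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU);

    printf("cores=%d PUs=%d ranks_per_node=%d\n", ncores, npus, ranks_per_node);
    if (ranks_per_node > ncores)
        printf("more ranks than cores: one rank per core cannot be bound\n");

    hwloc_topology_destroy(topo);
    return 0;
}

On the topology shown by hwloc-ls above (2 cores, 4 PUs per node) this would report more ranks than cores; whether that is the actual cause of the error here, or the binding code is misreading the topology, is exactly what I'd like to confirm.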