We can't seem to get processor affinity working with OpenMPI 1.6.4 or newer.
Note this is a 2-socket machine with 8 cores per socket. We had compiled OpenMPI 1.4.2 with the following configure options:

===========================================================================
export CC=/apps/share/intel/v14.0.4.211/bin/icc
export CXX=/apps/share/intel/v14.0.4.211/bin/icpc
export FC=/apps/share/intel/v14.0.4.211/bin/ifort

version=1.4.2.I1404211

./configure \
  --prefix=/apps/share/openmpi/$version \
  --disable-shared \
  --enable-static \
  --enable-shared=no \
  --with-openib \
  --with-libnuma=/usr \
  --enable-mpirun-prefix-by-default \
  --with-memory-manager=none \
  --with-tm=/apps/share/TORQUE/current/Linux
===========================================================================

and then used this mpirun command (here we ran with 8 processes):

===========================================================================
/apps/share/openmpi/1.4.2.I1404211/bin/mpirun \
  --prefix /apps/share/openmpi/1.4.2.I1404211 \
  --mca mpi_paffinity_alone 1 \
  --mca btl openib,tcp,sm,self \
  -x LD_LIBRARY_PATH \
  {model args}
===========================================================================

When we checked the process map, it looked like this:

PID   COMMAND  CPUMASK  TOTAL   [   N0      N1    N2  N3  N4  N5 ]
22232 prog1    0        469.9M  [ 469.9M     0     0   0   0   0 ]
22233 prog1    1        479.0M  [   4.0M  475.0M   0   0   0   0 ]
22234 prog1    2        516.7M  [ 516.7M     0     0   0   0   0 ]
22235 prog1    3        485.4M  [   8.0M  477.4M   0   0   0   0 ]
22236 prog1    4        482.6M  [ 482.6M     0     0   0   0   0 ]
22237 prog1    5        486.6M  [   6.0M  480.6M   0   0   0   0 ]
22238 prog1    6        481.3M  [ 481.3M     0     0   0   0   0 ]
22239 prog1    7        419.4M  [   8.0M  411.4M   0   0   0   0 ]

Now with 1.6.4 and higher, we did the following:

===========================================================================
export CC=/apps/share/intel/v14.0.4.211/bin/icc
export CXX=/apps/share/intel/v14.0.4.211/bin/icpc
export FC=/apps/share/intel/v14.0.4.211/bin/ifort

version=1.6.4.I1404211

./configure \
  --disable-vt \
  --prefix=/apps/share/openmpi/$version \
  --disable-shared \
  --enable-static \
  --with-verbs \
  --enable-mpirun-prefix-by-default \
  --with-memory-manager=none \
  --with-hwloc \
  --enable-mpi-ext \
  --with-tm=/apps/share/TORQUE/current/Linux
===========================================================================

We've tried the same mpirun command with -bind-to-core, with -bind-to-core -bycore, etc., but I can't seem to find the combination of args that reproduces the 1.4.2 behavior. A representative attempt is sketched below.
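For concreteness, one of the 1.6.4 attempts looked roughly like this (sketched from memory; --report-bindings is extra, added only so mpirun prints the binding it thinks it applied to each rank):

===========================================================================
# Same command as the 1.4.2 run, but with the 1.6.4 install and binding flags.
/apps/share/openmpi/1.6.4.I1404211/bin/mpirun \
  --prefix /apps/share/openmpi/1.6.4.I1404211 \
  --bind-to-core --bycore \
  --report-bindings \
  --mca btl openib,tcp,sm,self \
  -x LD_LIBRARY_PATH \
  {model args}
===========================================================================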
We get the following process map (this output is with mpirun args --bind-to-socket --mca mpi_paffinity_alone 1):

PID   COMMAND  CPUMASK                                     TOTAL  [  N0    N1  N2  N3  N4  N5 ]
24176 prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.2M  [ 60.2M   0   0   0   0   0 ]
24177 prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M   0   0   0   0   0 ]
24178 prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M   0   0   0   0   0 ]
24179 prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M   0   0   0   0   0 ]
24180 prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M   0   0   0   0   0 ]
24181 prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M   0   0   0   0   0 ]
24182 prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M   0   0   0   0   0 ]
24183 prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M   0   0   0   0   0 ]

Here is the map using just --mca mpi_paffinity_alone 1:

PID   COMMAND  CPUMASK  TOTAL  [  N0    N1  N2  N3  N4  N5 ]
25846 prog1    0,16     60.6M  [ 60.6M   0   0   0   0   0 ]
25847 prog1    2,18     60.6M  [ 60.6M   0   0   0   0   0 ]
25848 prog1    4,20     60.6M  [ 60.6M   0   0   0   0   0 ]
25849 prog1    6,22     60.6M  [ 60.6M   0   0   0   0   0 ]
25850 prog1    8,24     60.6M  [ 60.6M   0   0   0   0   0 ]
25851 prog1    10,26    60.6M  [ 60.6M   0   0   0   0   0 ]
25852 prog1    12,28    60.6M  [ 60.6M   0   0   0   0   0 ]
25853 prog1    14,30    60.6M  [ 60.6M   0   0   0   0   0 ]

I figure I am either compiling incorrectly or using the wrong mpirun args. Can someone tell me how to duplicate the 1.4.2 behavior of binding each process to a single core?

Any help appreciated. Thanks,
Tom
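P.S. In case it helps, the per-process masks can also be checked directly with taskset from util-linux (assuming prog1 is the only matching process name on the node); this is independent of the process-map tool used for the output above:

===========================================================================
# Print the CPU affinity list the kernel reports for every running prog1 rank.
for pid in $(pgrep prog1); do
    taskset -cp "$pid"
done
===========================================================================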