Your configure options look fine. To get one process assigned to each core (irrespective of HT on or off):

    --map-by core --bind-to core

This will tight-pack the processes, i.e., they will be placed on successive cores. If you want to balance the load across the allocation (if #procs < #cores in the allocation):

    --map-by node --bind-to core

HTH
Ralph
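To make this concrete, here is a minimal sketch combining those two suggestions with the btl and environment arguments from the launch command quoted later in the thread (1.8-era option syntax assumed, untested here; --report-bindings is an addition, used only to print the resulting bindings):

===========================================================================
# tight-packed: successive ranks placed on successive cores
mpirun --map-by core --bind-to core --report-bindings \
       --mca btl openib,tcp,sm,self \
       -x LD_LIBRARY_PATH \
       {model args}

# balanced across the allocation instead of tight-packed
mpirun --map-by node --bind-to core --report-bindings \
       --mca btl openib,tcp,sm,self \
       -x LD_LIBRARY_PATH \
       {model args}
===========================================================================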
> On Apr 10, 2015, at 7:24 AM, Tom Wurgler <twu...@goodyear.com> wrote:
>
> Thanks for the responses.
>
> The idea is to bind one process per processor. The actual problem that prompted the investigation is that a job run with 1.4.2 completes in 59 minutes, while the same job under 1.6.4 or 1.8.4 takes 79 minutes on the same machine, with the same compiler, etc. While trying to track down the reason for the run-time difference, I found that the binding behavior differs. Hence the question.
>
> I believe it is doing what we requested, but not what we want. The bind-to-socket was just one attempt at making it bind one per processor. I tried about 15 different combinations of the mpirun args, and none matched the behavior or the run time of 1.4.2, which is a huge concern for us.
>
> I just checked this machine and hyperthreading is on. I can change that and retest.
>
> Are my configure options OK for configuring 1.6.4 and newer?
> And what mpirun options should I be using to get 1 process per processor?
>
> This job was an 8-core test job, but the core count varies per type of job (and the jobs will run on the big clusters, not this compile server).
>
> The run-time differences show up across all our clusters: Intel based, AMD based, various SuSE OS versions.
>
> thanks
> tom
>
> From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <r...@open-mpi.org>
> Sent: Friday, April 10, 2015 9:54 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] Assigning processes to cores 1.4.2, 1.6.4 and 1.8.4
>
> Actually, I believe from the cmd line that the questioner wanted each process to be bound to a single core.
>
> From your output, I'm guessing you have hyperthreads enabled on your system - yes? In that case, the 1.4 series is likely binding each process to a single HT because it isn't sophisticated enough to recognize the difference between an HT and a core.
>
> Later versions of OMPI do know the difference. When you tell OMPI to bind to core, it will bind you to -both- HTs of that core. Hence the output you showed here:
>
>> here is the map using just --mca mpi_paffinity_alone 1:
>>
>> PID    COMMAND  CPUMASK  TOTAL   [ N0     N1  N2  N3  N4  N5 ]
>> 25846  prog1    0,16     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25847  prog1    2,18     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25848  prog1    4,20     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25849  prog1    6,22     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25850  prog1    8,24     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25851  prog1    10,26    60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25852  prog1    12,28    60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25853  prog1    14,30    60.6M   [ 60.6M  0   0   0   0   0  ]
>
> When you tell us bind-to socket, we bind you to every HT in that socket. Since you are running fewer than 8 processes, and we map by core by default, all the processes are bound to the first socket. This is what you show in this output:
>
>> We get the following process map (this output is with mpirun args --bind-to-socket --mca mpi_paffinity_alone 1):
>>
>> PID    COMMAND  CPUMASK                                     TOTAL  [ N0     N1  N2  N3  N4  N5 ]
>> 24176  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.2M  [ 60.2M  0   0   0   0   0  ]
>> 24177  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24178  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24179  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24180  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24181  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24182  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24183  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>
> So it looks to me like OMPI is doing exactly what you requested. I admit the HT numbering in the cpumask is strange, but that's the way your BIOS numbered them.
>
> HTH
> Ralph
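If the actual goal is to reproduce the 1.4-series behavior of pinning each rank to a single hardware thread, the 1.8 series can also treat hwthreads as the unit of mapping and binding. A minimal sketch, assuming 1.8.x option syntax (untested in this thread, and note the rank-to-HT ordering may still differ from what 1.4.2 produced):

===========================================================================
# bind each rank to one hardware thread rather than to both HTs of a core
mpirun --use-hwthread-cpus --map-by hwthread --bind-to hwthread \
       --report-bindings {model args}
===========================================================================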
>> On Apr 10, 2015, at 6:29 AM, Nick Papior Andersen <nickpap...@gmail.com> wrote:
>>
>> Bug, it should be "span,pe=2"
>>
>> 2015-04-10 15:28 GMT+02:00 Nick Papior Andersen <nickpap...@gmail.com>:
>>
>> I guess you want process #1 to have core 0 and core 1 bound to it, and process #2 to have core 2 and core 3 bound?
>>
>> I can do this with (I do this with 1.8.4; I do not think it works with 1.6.x):
>>
>>    --map-by ppr:4:socket:span:pe=2
>>
>> ppr    = processes per resource
>> socket = the resource
>> span   = load-balance the processes
>> pe     = bind processing elements to each process
>>
>> This should launch 8 processes (you have 2 sockets). Each process should have 2 processing elements bound to it.
>> You can check with --report-bindings to see each process's bindings.
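Combining Nick's correction with his original suggestion, the fixed flag would presumably read as below (a sketch for 1.8.4, untested; per his follow-up, the last two modifiers are comma-separated):

===========================================================================
# 8 ranks on this 2-socket node, each bound to 2 processing elements
mpirun --map-by ppr:4:socket:span,pe=2 --report-bindings {model args}
===========================================================================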
>> 2015-04-10 15:16 GMT+02:00 <twu...@goodyear.com>:
>>
>> We can't seem to get "processor affinity" using 1.6.4 or newer OpenMPI.
>>
>> Note this is a 2-socket machine with 8 cores per socket.
>>
>> We had compiled OpenMPI 1.4.2 with the following configure options:
>>
>> ===========================================================================
>> export CC=/apps/share/intel/v14.0.4.211/bin/icc
>> export CXX=/apps/share/intel/v14.0.4.211/bin/icpc
>> export FC=/apps/share/intel/v14.0.4.211/bin/ifort
>>
>> version=1.4.2.I1404211
>>
>> ./configure \
>>     --prefix=/apps/share/openmpi/$version \
>>     --disable-shared \
>>     --enable-static \
>>     --enable-shared=no \
>>     --with-openib \
>>     --with-libnuma=/usr \
>>     --enable-mpirun-prefix-by-default \
>>     --with-memory-manager=none \
>>     --with-tm=/apps/share/TORQUE/current/Linux
>> ===========================================================================
>>
>> and then used this mpirun command (where we used 8 cores):
>>
>> ===========================================================================
>> /apps/share/openmpi/1.4.2.I1404211/bin/mpirun \
>>     --prefix /apps/share/openmpi/1.4.2.I1404211 \
>>     --mca mpi_paffinity_alone 1 \
>>     --mca btl openib,tcp,sm,self \
>>     -x LD_LIBRARY_PATH \
>>     {model args}
>> ===========================================================================
>>
>> And when we checked the process map, it looked like this:
>>
>> PID    COMMAND  CPUMASK  TOTAL    [ N0      N1      N2  N3  N4  N5 ]
>> 22232  prog1    0        469.9M   [ 469.9M  0       0   0   0   0  ]
>> 22233  prog1    1        479.0M   [ 4.0M    475.0M  0   0   0   0  ]
>> 22234  prog1    2        516.7M   [ 516.7M  0       0   0   0   0  ]
>> 22235  prog1    3        485.4M   [ 8.0M    477.4M  0   0   0   0  ]
>> 22236  prog1    4        482.6M   [ 482.6M  0       0   0   0   0  ]
>> 22237  prog1    5        486.6M   [ 6.0M    480.6M  0   0   0   0  ]
>> 22238  prog1    6        481.3M   [ 481.3M  0       0   0   0   0  ]
>> 22239  prog1    7        419.4M   [ 8.0M    411.4M  0   0   0   0  ]
>>
>> Now with 1.6.4 and higher, we did the following:
>>
>> ===========================================================================
>> export CC=/apps/share/intel/v14.0.4.211/bin/icc
>> export CXX=/apps/share/intel/v14.0.4.211/bin/icpc
>> export FC=/apps/share/intel/v14.0.4.211/bin/ifort
>>
>> version=1.6.4.I1404211
>>
>> ./configure \
>>     --disable-vt \
>>     --prefix=/apps/share/openmpi/$version \
>>     --disable-shared \
>>     --enable-static \
>>     --with-verbs \
>>     --enable-mpirun-prefix-by-default \
>>     --with-memory-manager=none \
>>     --with-hwloc \
>>     --enable-mpi-ext \
>>     --with-tm=/apps/share/TORQUE/current/Linux
>> ===========================================================================
>>
>> We've tried the same mpirun command, with -bind-to-core, with -bind-to-core -bycore, etc., and I can't seem to get the right combination of args to get the same behavior as 1.4.2.
>> We get the following process map (this output is with mpirun args --bind-to-socket --mca mpi_paffinity_alone 1):
>>
>> PID    COMMAND  CPUMASK                                     TOTAL  [ N0     N1  N2  N3  N4  N5 ]
>> 24176  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.2M  [ 60.2M  0   0   0   0   0  ]
>> 24177  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24178  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24179  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24180  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24181  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24182  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24183  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>>
>> Here is the map using just --mca mpi_paffinity_alone 1:
>>
>> PID    COMMAND  CPUMASK  TOTAL   [ N0     N1  N2  N3  N4  N5 ]
>> 25846  prog1    0,16     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25847  prog1    2,18     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25848  prog1    4,20     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25849  prog1    6,22     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25850  prog1    8,24     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25851  prog1    10,26    60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25852  prog1    12,28    60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25853  prog1    14,30    60.6M   [ 60.6M  0   0   0   0   0  ]
>>
>> I figure I am compiling incorrectly or using the wrong mpirun args.
>>
>> Can someone tell me how to duplicate the behavior of 1.4.2 regarding binding the processes to cores?
>>
>> Any help appreciated.
>>
>> thanks
>>
>> tom
>>
>> --
>> Kind regards Nick
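For comparing what the different builds actually do, the bindings can be printed at launch time; --report-bindings is available in the 1.6 series as well as in 1.8. A minimal sketch against the 1.6.4 install above (untested):

===========================================================================
/apps/share/openmpi/1.6.4.I1404211/bin/mpirun \
    --report-bindings -bind-to-core -np 8 {model args}
===========================================================================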