Your configure options look fine. To get one process assigned to each core (irrespective of HT on or off):

    --map-by core --bind-to core

This will tight-pack the processes, i.e., they will be placed on successive cores. If you want to balance the load across the allocation (if #procs < #cores in the allocation):

    --map-by node --bind-to core

HTH
Ralph
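To make this concrete, here is a minimal sketch combining those two suggestions with the btl and environment arguments from the launch command quoted later in the thread (1.8-era option syntax assumed, untested here; --report-bindings is an addition, used only to print the resulting bindings):

===========================================================================
# tight-packed: successive ranks placed on successive cores
mpirun --map-by core --bind-to core --report-bindings \
       --mca btl openib,tcp,sm,self \
       -x LD_LIBRARY_PATH \
       {model args}

# balanced across the allocation instead of tight-packed
mpirun --map-by node --bind-to core --report-bindings \
       --mca btl openib,tcp,sm,self \
       -x LD_LIBRARY_PATH \
       {model args}
===========================================================================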
> On Apr 10, 2015, at 7:24 AM, Tom Wurgler <twu...@goodyear.com> wrote:
>
> Thanks for the responses.
>
> The idea is to bind one process per processor. The actual problem that prompted the investigation is that a job run with 1.4.2 completes in 59 minutes, while the same job under 1.6.4 or 1.8.4 takes 79 minutes on the same machine, with the same compiler, etc. While trying to track down the reason for the run-time difference, I found that the binding behavior differs. Hence the question.
>
> I believe it is doing what we requested, but not what we want. The bind-to-socket was just one attempt at making it bind one per processor. I tried about 15 different combinations of the mpirun args, and none matched the behavior or the run time of 1.4.2, which is a huge concern for us.
>
> I just checked this machine and hyperthreading is on. I can change that and retest.
>
> Are my configure options OK for configuring 1.6.4 and newer?
> And what mpirun options should I be using to get 1 process per processor?
>
> This job was an 8-core test job, but the core count varies per type of job (and the jobs will run on the big clusters, not this compile server).
>
> The run-time differences show up across all our clusters: Intel based, AMD based, various SuSE OS versions.
>
> thanks
> tom
>
> From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <r...@open-mpi.org>
> Sent: Friday, April 10, 2015 9:54 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] Assigning processes to cores 1.4.2, 1.6.4 and 1.8.4
>
> Actually, I believe from the cmd line that the questioner wanted each process to be bound to a single core.
>
> From your output, I'm guessing you have hyperthreads enabled on your system - yes? In that case, the 1.4 series is likely binding each process to a single HT because it isn't sophisticated enough to recognize the difference between an HT and a core.
>
> Later versions of OMPI do know the difference. When you tell OMPI to bind to core, it will bind you to -both- HTs of that core. Hence the output you showed here:
>
>> here is the map using just --mca mpi_paffinity_alone 1:
>>
>> PID    COMMAND  CPUMASK  TOTAL   [ N0     N1  N2  N3  N4  N5 ]
>> 25846  prog1    0,16     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25847  prog1    2,18     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25848  prog1    4,20     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25849  prog1    6,22     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25850  prog1    8,24     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25851  prog1    10,26    60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25852  prog1    12,28    60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25853  prog1    14,30    60.6M   [ 60.6M  0   0   0   0   0  ]
>
> When you tell us bind-to socket, we bind you to every HT in that socket. Since you are running fewer than 8 processes, and we map by core by default, all the processes are bound to the first socket. This is what you show in this output:
>
>> We get the following process map (this output is with mpirun args --bind-to-socket --mca mpi_paffinity_alone 1):
>>
>> PID    COMMAND  CPUMASK                                     TOTAL  [ N0     N1  N2  N3  N4  N5 ]
>> 24176  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.2M  [ 60.2M  0   0   0   0   0  ]
>> 24177  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24178  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24179  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24180  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24181  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24182  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24183  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>
> So it looks to me like OMPI is doing exactly what you requested. I admit the HT numbering in the cpumask is strange, but that's the way your BIOS numbered them.
>
> HTH
> Ralph
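If the actual goal is to reproduce the 1.4-series behavior of pinning each rank to a single hardware thread, the 1.8 series can also treat hwthreads as the unit of mapping and binding. A minimal sketch, assuming 1.8.x option syntax (untested in this thread, and note the rank-to-HT ordering may still differ from what 1.4.2 produced):

===========================================================================
# bind each rank to one hardware thread rather than to both HTs of a core
mpirun --use-hwthread-cpus --map-by hwthread --bind-to hwthread \
       --report-bindings {model args}
===========================================================================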
>> On Apr 10, 2015, at 6:29 AM, Nick Papior Andersen <nickpap...@gmail.com> wrote:
>>
>> Bug, it should be "span,pe=2"
>>
>> 2015-04-10 15:28 GMT+02:00 Nick Papior Andersen <nickpap...@gmail.com>:
>>
>> I guess you want process #1 to have core 0 and core 1 bound to it, and process #2 to have core 2 and core 3 bound?
>>
>> I can do this with (I do this with 1.8.4; I do not think it works with 1.6.x):
>>
>>    --map-by ppr:4:socket:span:pe=2
>>
>> ppr    = processes per resource
>> socket = the resource
>> span   = load-balance the processes
>> pe     = bind processing elements to each process
>>
>> This should launch 8 processes (you have 2 sockets). Each process should have 2 processing elements bound to it.
>> You can check with --report-bindings to see each process's bindings.
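Combining Nick's correction with his original suggestion, the fixed flag would presumably read as below (a sketch for 1.8.4, untested; per his follow-up, the last two modifiers are comma-separated):

===========================================================================
# 8 ranks on this 2-socket node, each bound to 2 processing elements
mpirun --map-by ppr:4:socket:span,pe=2 --report-bindings {model args}
===========================================================================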
>> 2015-04-10 15:16 GMT+02:00 <twu...@goodyear.com>:
>>
>> We can't seem to get "processor affinity" using 1.6.4 or newer OpenMPI.
>>
>> Note this is a 2-socket machine with 8 cores per socket.
>>
>> We had compiled OpenMPI 1.4.2 with the following configure options:
>>
>> ===========================================================================
>> export CC=/apps/share/intel/v14.0.4.211/bin/icc
>> export CXX=/apps/share/intel/v14.0.4.211/bin/icpc
>> export FC=/apps/share/intel/v14.0.4.211/bin/ifort
>>
>> version=1.4.2.I1404211
>>
>> ./configure \
>>     --prefix=/apps/share/openmpi/$version \
>>     --disable-shared \
>>     --enable-static \
>>     --enable-shared=no \
>>     --with-openib \
>>     --with-libnuma=/usr \
>>     --enable-mpirun-prefix-by-default \
>>     --with-memory-manager=none \
>>     --with-tm=/apps/share/TORQUE/current/Linux
>> ===========================================================================
>>
>> and then used this mpirun command (where we used 8 cores):
>>
>> ===========================================================================
>> /apps/share/openmpi/1.4.2.I1404211/bin/mpirun \
>>     --prefix /apps/share/openmpi/1.4.2.I1404211 \
>>     --mca mpi_paffinity_alone 1 \
>>     --mca btl openib,tcp,sm,self \
>>     -x LD_LIBRARY_PATH \
>>     {model args}
>> ===========================================================================
>>
>> And when we checked the process map, it looked like this:
>>
>> PID    COMMAND  CPUMASK  TOTAL    [ N0      N1      N2  N3  N4  N5 ]
>> 22232  prog1    0        469.9M   [ 469.9M  0       0   0   0   0  ]
>> 22233  prog1    1        479.0M   [ 4.0M    475.0M  0   0   0   0  ]
>> 22234  prog1    2        516.7M   [ 516.7M  0       0   0   0   0  ]
>> 22235  prog1    3        485.4M   [ 8.0M    477.4M  0   0   0   0  ]
>> 22236  prog1    4        482.6M   [ 482.6M  0       0   0   0   0  ]
>> 22237  prog1    5        486.6M   [ 6.0M    480.6M  0   0   0   0  ]
>> 22238  prog1    6        481.3M   [ 481.3M  0       0   0   0   0  ]
>> 22239  prog1    7        419.4M   [ 8.0M    411.4M  0   0   0   0  ]
>>
>> Now with 1.6.4 and higher, we did the following:
>>
>> ===========================================================================
>> export CC=/apps/share/intel/v14.0.4.211/bin/icc
>> export CXX=/apps/share/intel/v14.0.4.211/bin/icpc
>> export FC=/apps/share/intel/v14.0.4.211/bin/ifort
>>
>> version=1.6.4.I1404211
>>
>> ./configure \
>>     --disable-vt \
>>     --prefix=/apps/share/openmpi/$version \
>>     --disable-shared \
>>     --enable-static \
>>     --with-verbs \
>>     --enable-mpirun-prefix-by-default \
>>     --with-memory-manager=none \
>>     --with-hwloc \
>>     --enable-mpi-ext \
>>     --with-tm=/apps/share/TORQUE/current/Linux
>> ===========================================================================
>>
>> We've tried the same mpirun command, with -bind-to-core, with -bind-to-core -bycore, etc., and I can't seem to get the right combination of args to get the same behavior as 1.4.2.
>> We get the following process map (this output is with mpirun args --bind-to-socket --mca mpi_paffinity_alone 1):
>>
>> PID    COMMAND  CPUMASK                                     TOTAL  [ N0     N1  N2  N3  N4  N5 ]
>> 24176  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.2M  [ 60.2M  0   0   0   0   0  ]
>> 24177  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24178  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24179  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24180  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24181  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24182  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>> 24183  prog1    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30  60.5M  [ 60.5M  0   0   0   0   0  ]
>>
>> Here is the map using just --mca mpi_paffinity_alone 1:
>>
>> PID    COMMAND  CPUMASK  TOTAL   [ N0     N1  N2  N3  N4  N5 ]
>> 25846  prog1    0,16     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25847  prog1    2,18     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25848  prog1    4,20     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25849  prog1    6,22     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25850  prog1    8,24     60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25851  prog1    10,26    60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25852  prog1    12,28    60.6M   [ 60.6M  0   0   0   0   0  ]
>> 25853  prog1    14,30    60.6M   [ 60.6M  0   0   0   0   0  ]
>>
>> I figure I am compiling incorrectly or using the wrong mpirun args.
>>
>> Can someone tell me how to duplicate the behavior of 1.4.2 regarding binding the processes to cores?
>>
>> Any help appreciated.
>>
>> thanks
>>
>> tom
>>
>> --
>> Kind regards Nick
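For comparing what the different builds actually do, the bindings can be printed at launch time; --report-bindings is available in the 1.6 series as well as in 1.8. A minimal sketch against the 1.6.4 install above (untested):

===========================================================================
/apps/share/openmpi/1.6.4.I1404211/bin/mpirun \
    --report-bindings -bind-to-core -np 8 {model args}
===========================================================================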