> On 26.03.2020, at 17:00, Tobias Klöffel <tobias.kloef...@fau.de> wrote:
>
> Hi Carsten,
>
> On 3/24/20 9:02 PM, Kutzner, Carsten wrote:
>> Hi,
>>
>>> On 24.03.2020, at 16:28, Tobias Klöffel <tobias.kloef...@fau.de> wrote:
>>>
>>> Dear all,
>>> I am very new to Gromacs, so maybe some of my problems are very easy
>>> to fix :)
>>> Currently I am trying to compile and benchmark Gromacs on AMD Rome
>>> CPUs; the benchmarks are taken from:
>>> https://www.mpibpc.mpg.de/grubmueller/bench
>>>
>>> 1) OpenMP parallelization: Is it done via OpenMP tasks?
>> Yes, loops all over the code are parallelized with OpenMP via
>> #pragma omp parallel for and similar directives.
> Ok, but that's not OpenMP tasking :)
>>
>>> If the Intel toolchain is detected and -DGMX_FFT_LIBRARY=mkl is
>>> set, -mkl=serial is used, even though -DGMX_OPENMP=on is set.
>> GROMACS uses only the serial transforms; allowing MKL to open up its
>> own OpenMP threads would lead to oversubscription of cores and
>> performance degradation.
> Ah, I see. But then it should be noted somewhere in the docs that all
> FFTW/MKL calls happen inside a parallel region. Is there a specific
> reason for this? Normally you can achieve much better performance if
> you call a threaded library outside of a parallel region and let the
> library use its own threads.
>>> 2) I am trying to use gmx_mpi tune_pme, but I never got it to run.
>>> I do not really understand what I have to specify for -mdrun. I
>> Normally you need a serial (read: non-MPI-enabled) 'gmx' so that you
>> can call gmx tune_pme. Most queueing systems don't like it if one
>> parallel program calls another parallel program.
>>
>>> tried -mdrun 'gmx_mpi mdrun' and
>>> export MPIRUN="mpirun -use-hwthread-cpus -np $tmpi -map-by
>>> ppr:$tnode:node:pe=$OMP_NUM_THREADS --report-bindings"
>>> But it just complains that mdrun is not working.
>> There should be output somewhere with the exact command line that
>> tune_pme invoked to test whether mdrun works. That should shed some
>> light on the issue.
>>
>> Side note: tuning is normally only useful on CPU nodes. If your nodes
>> also have GPUs, you will probably not want to do this kind of PME
>> tuning.
> Yes, it's CPU only... I will tune the PP:PME rank split manually.
> However, most of the time it fails with 'too large prime number'.
> What is considered 'too large'?
I think numbers whose prime factors are only 2, 3, 5, 7, 11, and 13 are
ok, but not larger prime factors. So for a fixed number of procs, only
some of the PP:PME combinations will actually work. The ones that don't
work would not be wise choices from a performance point of view anyway.
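If you want to check candidate PP:PME splits up front, a little helper
like this does the job (my own sketch in C, assuming the factor rule I
stated above; it is not the check GROMACS itself performs):

    #include <stdio.h>

    /* Return 1 if n has no prime factor larger than 13 (the rule
     * described above), 0 otherwise. */
    static int small_primes_only(int n)
    {
        const int primes[] = { 2, 3, 5, 7, 11, 13 };
        for (int i = 0; i < 6; i++)
        {
            while (n % primes[i] == 0)
            {
                n /= primes[i];
            }
        }
        return n == 1;
    }

    int main(void)
    {
        /* Example: with 64 ranks total, a 48:16 PP:PME split is fine
         * (16 = 2^4), but 47:17 is not, since 17 is a prime > 13. */
        printf("16: %s\n", small_primes_only(16) ? "ok" : "too large prime");
        printf("17: %s\n", small_primes_only(17) ? "ok" : "too large prime");
        return 0;
    }

For the manual tuning you mentioned, the split can then be fixed with
mdrun's -npme option.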
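Two more illustrations, in case they help. For your point 1), the
OpenMP pattern used throughout the code looks roughly like this (a
made-up minimal example, not actual GROMACS source; the function and
variable names are invented):

    /* Loop-level work sharing: the iteration space is split across the
     * threads of the team. No OpenMP tasks are involved. */
    void scale_forces(float* f, int n, float s)
    {
    #pragma omp parallel for
        for (int i = 0; i < n; i++)
        {
            f[i] *= s;
        }
    }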
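And for your point 2), once you have a non-MPI 'gmx' built, an
invocation along these lines should work (the rank count and the .tpr
file name are placeholders for your setup):

    export MPIRUN="mpirun -np 16"
    gmx tune_pme -np 16 -s bench.tpr -mdrun 'gmx_mpi mdrun'

The serial gmx tune_pme then uses the command given in MPIRUN to launch
the MPI-enabled mdrun for each PP:PME setting it tests.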
Best,
  Carsten