Hi Szilárd,

Thank you for answering. It did indeed show a significant improvement, in particular with:

$ gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 8 -ntomp 6 -npme 1 -gputasks 00000001

I also now understand better how to control each individual simulation. Your point on maximizing aggregate performance is well taken :-)

Thanks again
/PK

2018-02-09 16:25 GMT+01:00, Szilárd Páll <pall.szil...@gmail.com>:
> Hi,
>
> First of all, have you read the docs (admittedly somewhat brief)?
> http://manual.gromacs.org/documentation/2018/user-guide/mdrun-performance.html#types-of-gpu-tasks
>
> The current PME GPU implementation was optimized for single-GPU runs. Using
> multiple GPUs with PME offloaded works, but this mode hasn't been an
> optimization target, and it will often not give very good performance.
> Using multiple GPUs requires a separate PME rank (as you have realized),
> only one can be used (as we don't support PME decomposition on GPUs), and
> it comes with some inherent scaling drawbacks. For this reason, unless you
> _need_ your single run to be as fast as possible, you'll be better off
> running multiple simulations side by side.
>
> A few tips for tuning the performance of a multi-GPU run with PME offload:
> * expect at best ~1.5x scaling to 2 GPUs (rarely to 3, if the tasks allow)
> * generally it's best to use about the same decomposition that you'd use
>   with nonbonded-only offload, e.g. in your case 6-8 ranks
> * map the PME GPU task alone, or at most together with 1 PP rank, to a
>   GPU, i.e. use the new -gputasks option
>
> e.g. for your case I'd expect the following to work ~best:
> gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 8 -ntomp 6 -npme 1 -gputasks 00000001
> or
> gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 8 -ntomp 6 -npme 1 -gputasks 00000011
>
> Let me know if that gave some improvement.
> Cheers,
> --
> Szilárd
>
> On Fri, Feb 9, 2018 at 8:51 AM, Gmx QA <gmxquesti...@gmail.com> wrote:
>> Hi list,
>>
>> I am trying out the new GROMACS 2018 (really nice so far), but have a few
>> questions about what command-line options I should specify, specifically
>> with the new GPU PME implementation.
>>
>> My computer has two CPUs (12 cores each, 24 with hyperthreading) and two
>> GPUs, and I currently (with 2018) start simulations like this:
>>
>> $ gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 2 -npme 1 -ntomp 24 -gpu_id 01
>>
>> This works, but GROMACS prints the message that 24 OpenMP threads per MPI
>> rank is likely inefficient. However, when I try to reduce the number of
>> OpenMP threads I see a reduction in performance. Is this message no longer
>> relevant with GPU PME, or am I overlooking something?
>>
>> Thanks
>> /PK
>>
>> --
>> Gromacs Users mailing list
>>
>> * Please search the archive at
>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>>
>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>
>> * For (un)subscribe requests visit
>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>> send a mail to gmx-users-requ...@gromacs.org.
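The -gputasks strings discussed in this thread encode one GPU id per thread-MPI rank, in rank order; in the examples above the single separate PME rank's task is the last digit. As a minimal sketch of how one might build such a string for other rank counts (plain POSIX shell, no GROMACS required; the variable names are illustrative helpers, not mdrun options, and the PME-task-last ordering is assumed from the examples in the thread):

```shell
#!/bin/sh
# Build a -gputasks digit string: all PP (nonbonded) tasks on one GPU,
# the single PME task on another, matching the 8-rank example above.

ntmpi=8      # total thread-MPI ranks
npme=1       # separate PME ranks (only one PME GPU task is supported)
pp_gpu=0     # GPU id for the PP tasks
pme_gpu=1    # GPU id for the PME task

gputasks=""
i=0
while [ "$i" -lt $((ntmpi - npme)) ]; do   # one digit per PP rank
  gputasks="${gputasks}${pp_gpu}"
  i=$((i + 1))
done
while [ "$i" -lt "$ntmpi" ]; do            # PME task digit(s) last
  gputasks="${gputasks}${pme_gpu}"
  i=$((i + 1))
done

echo "$gputasks"   # prints 00000001
# then: gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi $ntmpi -ntomp 6 \
#           -npme $npme -gputasks "$gputasks"
```

Setting pp_gpu=0 for only the first six ranks and pme_gpu=1 for the rest would instead reproduce the alternative 00000011 mapping, where one PP rank shares GPU 1 with the PME task.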