Re: [gmx-users] 2018 performance question
From: Szilárd Páll <pall.szil...@gmail.com>
To: Michael Brunsteiner <mbx0...@yahoo.com>
Sent: Thursday, February 22, 2018 4:15 PM
Subject: Re: [gmx-users] 2018 performance question

Hi,

What I meant is _not_ that you should scale the GPU-accelerated GROMACS across multiple GPUs -- the scaling efficiency is not great and depends a lot on the system size.

What I meant instead is that the GROMACS 2018 release now requires fewer cores per GPU to get near peak performance (in single-GPU mode). Therefore, whereas you may not see a lot of improvement if you just keep using the 6 cores/GPU in your machine, you generally should be able to run 2-3 simulations side by side, each on a separate GPU, but each using only 2-3 cores.

Cheers,
--
Szilárd

On Thu, Feb 22, 2018 at 9:59 AM, Michael Brunsteiner <mbx0...@yahoo.com> wrote:
> From: Szilárd Páll <pall.szil...@gmail.com>
> To: Michael Brunsteiner <mbx0...@yahoo.com>
> Sent: Wednesday, February 21, 2018 9:16 PM
> Subject: Re: [gmx-users] 2018 performance question
>
>> Hi Michael,
>>
>> Why not use both GPUs, you should be able to get up to 80% performance
>> on just 3 of the 6 cores.
>
> It's a bit more complicated. I have 4 machines with identical hardware,
> but one of the 4 graphics cards started giving me seg-faults lately,
> some of the RAM must be broken there ...
> so I need to replace that in any case ... but if you suggest that gmx
> can use two different graphics cards (780 and 1060) simultaneously, I'll
> certainly give that a try on the 3 others.
> Thanks for your help!
> regards,
> Michael

--
Gromacs Users mailing list

* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-requ...@gromacs.org.
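A minimal sketch of the side-by-side setup described in the reply above. It assumes two GPUs with ids 0 and 1, 3 cores per run with 2 hardware threads per core, and the hypothetical run names run0 and run1; none of these values come from the thread. The script only prints the commands (a dry run). The pin offset and stride reflect an assumption about how mdrun numbers hardware threads on this machine, so the pinning reported in the log should be checked before relying on them.

```shell
#!/bin/sh
# Dry-run sketch: print one mdrun command per GPU, each pinned to its own cores.
# Assumed values (not from the thread): 3 cores per run, 2 hardware threads
# per core, GPU ids 0 and 1, run names run0 and run1.

CORES_PER_RUN=3
THREADS_PER_CORE=2

# print_mdrun_cmd INDEX: print the command for the INDEX-th concurrent run.
print_mdrun_cmd() {
    idx=$1
    # Offset counted in hardware threads, so that runs do not share cores.
    offset=$((idx * CORES_PER_RUN * THREADS_PER_CORE))
    # One thread-MPI rank, one OpenMP thread per physical core (stride skips
    # the sibling hardware thread), one GPU per run.
    echo "gmx mdrun -deffnm run${idx} -ntmpi 1 -ntomp ${CORES_PER_RUN}" \
         "-pin on -pinoffset ${offset} -pinstride ${THREADS_PER_CORE}" \
         "-gpu_id ${idx}"
}

for i in 0 1; do
    print_mdrun_cmd "$i"
done
```

To launch the runs for real, each printed command would be started in the background (or in its own terminal) from the directory holding the matching input file.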
Re: [gmx-users] 2018 performance question
Hi Michael,

What you observe is most likely due to v2018 shifting the PME work to the GPU by default. This often means that fewer CPU cores are needed and that runs become more GPU-bound, leaving the CPU without work for part of the runtime. This should be easy to see by comparing the log files.

Especially with older GPUs (more than about two generations old, Kepler or earlier), running only part of the PME work on the GPU can be useful. This can be done by using the hybrid PME mode, which runs the 3D-FFT / gather on the CPU:

    gmx mdrun -pmefft cpu

That might give you better CPU/GPU load balance, and sometimes a moderate performance improvement.

Otherwise, there are a few things you can do to make better overall use of the machine:
- use fewer cores without giving up much performance (e.g. leave 2 cores free for other tasks) -- that's useful if you have other work you can do on the free cores;
- run multiple runs to fill "utilization gaps": e.g. run 2-3 concurrent runs with 2-3 cores each.

Cheers,
--
Szilárd

On Fri, Feb 16, 2018 at 12:41 PM, Michael Brunsteiner wrote:
> hi
>
> just installed gmx-2018 on an x86_64 PC with a GeForce GTX 780 and the
> CUDA software directly from the NVIDIA webpage (didn't work using the
> Debian NVIDIA packages).
> Output of lscpu is included below.
>
> I find that:
> 1) 2018 is slightly faster (~5%) than 2016.
> 2) both 2016 and 2018 use the GPU, but 2018 seems to use less CPU.
> With 2016, using the "top" command I usually see that the CPU load is
> close to 1200% (I have 6 cores, each two threads), while with 2018 this
> number is closer to around 400% (I guess this is because 2018 does PME
> on the GPU).
>
> My question is: can I possibly further improve the performance of 2018 by
> 1) somehow convincing gmx to use more CPU, or
> 2) running two instances of gmx on this one computer simultaneously?
>
> Thanks in advance for any feedback!
> cheers,
> Michael
>
> prompt> lscpu
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                12
> On-line CPU(s) list:   0-11
> Thread(s) per core:    2
> Core(s) per socket:    6
> Socket(s):             1
> NUMA node(s):          1
> Vendor ID:             GenuineIntel
> CPU family:            6
> Model:                 62
> Model name:            Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
> Stepping:              4
> CPU MHz:               3399.898
> CPU max MHz:           3900.
> CPU min MHz:           1200.
> BogoMIPS:              6799.79
> Virtualization:        VT-x
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              12288K
> NUMA node0 CPU(s):     0-11
> Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
> pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
> nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est
> tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt
> tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb kaiser tpr_shadow
> vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
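As a rough sketch of the log comparison and the hybrid PME mode suggested in the reply above: the script below prints two benchmark commands, one with the defaults and one with `-pmefft cpu`, and defines a small helper that reads the ns/day figure from the "Performance:" line at the end of an mdrun log. The input name bench.tpr, the output names pme_gpu and pme_hybrid, and the helper name `ns_per_day` are hypothetical; only `-pmefft cpu` comes from the thread.

```shell
#!/bin/sh
# ns_per_day LOGFILE: print the ns/day value from the "Performance:" line
# that mdrun writes near the end of its log file.
ns_per_day() {
    awk '$1 == "Performance:" { print $2 }' "$1"
}

# Dry run: print the two benchmark commands instead of executing them.
# bench.tpr, pme_gpu and pme_hybrid are hypothetical names.
echo "gmx mdrun -s bench.tpr -deffnm pme_gpu"
echo "gmx mdrun -s bench.tpr -deffnm pme_hybrid -pmefft cpu"

# Once both runs have finished, report the throughput found in each log.
for log in pme_gpu.log pme_hybrid.log; do
    if [ -f "$log" ]; then
        echo "$log: $(ns_per_day "$log") ns/day"
    fi
done
```

The same logs also contain the cycle and time accounting table, which is where a CPU that idles while waiting for the GPU would show up.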
[gmx-users] 2018 performance question
hi

just installed gmx-2018 on an x86_64 PC with a GeForce GTX 780 and the CUDA software directly from the NVIDIA webpage (didn't work using the Debian NVIDIA packages). Output of lscpu is included below.

I find that:
1) 2018 is slightly faster (~5%) than 2016.
2) both 2016 and 2018 use the GPU, but 2018 seems to use less CPU.
With 2016, using the "top" command I usually see that the CPU load is close to 1200% (I have 6 cores, each two threads), while with 2018 this number is closer to around 400% (I guess this is because 2018 does PME on the GPU).

My question is: can I possibly further improve the performance of 2018 by
1) somehow convincing gmx to use more CPU, or
2) running two instances of gmx on this one computer simultaneously?

Thanks in advance for any feedback!

cheers,
Michael

prompt> lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Model name:            Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
Stepping:              4
CPU MHz:               3399.898
CPU max MHz:           3900.
CPU min MHz:           1200.
BogoMIPS:              6799.79
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0-11
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts