Re: [gmx-users] 2018 performance question

2018-02-27 Thread Szilárd Páll
<pall.szil...@gmail.com>
 To: Michael Brunsteiner <mbx0...@yahoo.com> 
 Sent: Thursday, February 22, 2018 4:15 PM
 Subject: Re: [gmx-users] 2018 performance question
   
Hi,

What I meant is _not_ that you should scale the GPU-accelerated
GROMACS across multiple GPUs -- the scaling efficiency is not great
and depends a lot on the system size.

What I meant instead is that the GROMACS 2018 release now requires
fewer cores per GPU to get near peak performance (in single-GPU mode).
Therefore, while you may not see much improvement if you just keep
using the 6 cores/GPU in your machine, you should generally be able to
run 2-3 simulations side-by-side, each on a separate GPU and each
using only 2-3 cores.
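As a concrete (hypothetical) sketch of that setup on a 6-core, 2-GPU
box -- run names, core counts, and pin offsets are placeholders you
would adapt to your own inputs and CPU topology:

```shell
# Two concurrent single-GPU runs, each on its own GPU with 3 cores.
# -pinoffset keeps the two runs on disjoint cores; the exact offset/
# stride may need adjusting for your hardware-thread layout.
gmx mdrun -deffnm runA -ntmpi 1 -ntomp 3 -gpu_id 0 -pin on -pinoffset 0 &
gmx mdrun -deffnm runB -ntmpi 1 -ntomp 3 -gpu_id 1 -pin on -pinoffset 3 &
wait
```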

Cheers,
--
Szilárd


On Thu, Feb 22, 2018 at 9:59 AM, Michael Brunsteiner <mbx0...@yahoo.com> wrote:
> From: Szilárd Páll <pall.szil...@gmail.com>
>
>
> To: Michael Brunsteiner <mbx0...@yahoo.com>
> Sent: Wednesday, February 21, 2018 9:16 PM
> Subject: Re: [gmx-users] 2018 performance question
>
>> Hi Michael,
>>
>> Why not use both GPUs? You should be able to get up to 80% of the
>> performance on just 3 of the 6 cores.
>
> it's a bit more complicated. i have 4 machines with identical hardware,
> but one of the 4 graphics cards started giving me seg-faults lately;
> some of its RAM must be broken ...
> so i need to replace that in any case ... but if you suggest that gmx
> can use two different graphics cards (780 and 1060) simultaneously, I'll
> certainly give that a try on the 3 others.
> thanks for your help!
> regards,
> Michael
>
>
>

   
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] 2018 performance question

2018-02-20 Thread Szilárd Páll
Hi Michael,

What you observe is most likely due to v2018 by default shifting the
PME work to the GPU which will often mean fewer CPU cores are needed
and runs become more GPU-bound leaving the CPU without work for part
of the runtime. This should be easily seen by comparing the log files.

Especially with older GPUs (roughly two or more generations old, i.e.
Kepler or earlier), running only part of the PME work on the GPU can
be useful. This can be done with the hybrid PME mode, which runs the
3D-FFT / gather on the CPU:
gmx mdrun -pmefft cpu

That might give you better CPU/GPU load balance, and sometimes a
moderate performance improvement. Otherwise, there are a few things
you can do to make better overall use of the machine:
- use fewer cores without giving up much performance (e.g. leave 2
cores free for other tasks) -- that's useful if you have other work
you can do on the free cores;
- run multiple runs to fill "utilization gaps": e.g. run 2-3
concurrent runs with 2-3 cores each.
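A minimal sketch of the first option, assuming a 6-core, single-GPU
machine (the run name and core count are placeholders, not settings
from this thread):

```shell
# Hybrid PME (3D-FFT/gather on CPU) while leaving 2 cores free for
# other work; -deffnm md is a placeholder for your own run files.
gmx mdrun -deffnm md -ntmpi 1 -ntomp 4 -pmefft cpu -pin on
```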

Cheers,
--
Szilárd


On Fri, Feb 16, 2018 at 12:41 PM, Michael Brunsteiner wrote:
>
> hi
> just installed gmx-2018 on an x86_64 PC with a GeForce GTX 780 and the
> CUDA software directly from the nvidia webpage (didn't work using the
> debian nvidia packages)
> output of lscpu is included below.
> i find that:
> 1) 2018 is slightly faster (~5%) than 2016.
> 2) both 2016 and 2018 use the GPU, but 2018 seems to use less CPU.
> With 2016, using the "top" command, i usually see that the CPU load is
> close to 1200% (i have 6 cores, each with two threads), while with 2018
> this number is closer to around 400% (I guess this is because 2018 does
> PME on the GPU).
> my question is: can i possibly further improve the performance of 2018 by
> 1) somehow convincing gmx to use more CPU, or
> 2) running two instances of gmx on this one computer simultaneously?
> thanks in advance for any feedback!
> cheers,
> Michael
>
>
>
> prompt> lscpu
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                12
> On-line CPU(s) list:   0-11
> Thread(s) per core:    2
> Core(s) per socket:    6
> Socket(s):             1
> NUMA node(s):          1
> Vendor ID:             GenuineIntel
> CPU family:            6
> Model:                 62
> Model name:            Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
> Stepping:              4
> CPU MHz:               3399.898
> CPU max MHz:           3900.
> CPU min MHz:           1200.
> BogoMIPS:              6799.79
> Virtualization:        VT-x
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              12288K
> NUMA node0 CPU(s):     0-11
> Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
> pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
> nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est 
> tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt 
> tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb kaiser tpr_shadow 
> vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
>
>

[gmx-users] 2018 performance question

2018-02-16 Thread Michael Brunsteiner

hi
just installed gmx-2018 on an x86_64 PC with a GeForce GTX 780 and the
CUDA software directly from the nvidia webpage (didn't work using the
debian nvidia packages)
output of lscpu is included below.
i find that:
1) 2018 is slightly faster (~5%) than 2016.
2) both 2016 and 2018 use the GPU, but 2018 seems to use less CPU.
With 2016, using the "top" command, i usually see that the CPU load is
close to 1200% (i have 6 cores, each with two threads), while with 2018
this number is closer to around 400% (I guess this is because 2018 does
PME on the GPU).
my question is: can i possibly further improve the performance of 2018 by
1) somehow convincing gmx to use more CPU, or
2) running two instances of gmx on this one computer simultaneously?
thanks in advance for any feedback!
cheers,
Michael



prompt> lscpu
Architecture:          x86_64
CPU op-mode(s):    32-bit, 64-bit
Byte Order:    Little Endian
CPU(s):    12
On-line CPU(s) list:   0-11
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s): 1
NUMA node(s):  1
Vendor ID: GenuineIntel
CPU family:    6
Model: 62
Model name:    Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
Stepping:  4
CPU MHz:   3399.898
CPU max MHz:   3900.
CPU min MHz:   1200.
BogoMIPS:  6799.79
Virtualization:    VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache:  256K
L3 cache:  12288K
NUMA node0 CPU(s): 0-11
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 
ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer 
aes xsave avx f16c rdrand lahf_lm epb kaiser tpr_shadow vnmi flexpriority ept 
vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts

