Re: [gmx-users] Updating GTX670 PCIE speed from 5GT/s to 8GT/s resulted in about 10% speedup of md_run.

2013-12-04 Thread Henk Neefs
Your suspicion is correct: the CPU is waiting more than 20% of the run time
for the GPU to finish.
--
Henk


--
View this message in context: 
http://gromacs.5086.x6.nabble.com/Updating-GTX670-PCIE-speed-from-5GT-s-to-8GT-s-resulted-in-about-10-speedup-of-md-run-tp5012945p5013082.html
Sent from the GROMACS Users Forum mailing list archive at Nabble.com.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.


Re: [gmx-users] Updating GTX670 PCIE speed from 5GT/s to 8GT/s resulted in about 10% speedup of md_run.

2013-12-03 Thread Henk Neefs
The information below might be of interest to the Gromacs
development/optimization team.

What can we derive from the 10% md_run speedup when the PCIE link speed
increases from 5 GT/s (the PCIE 2.0 rate) to 8 GT/s (the PCIE 3.0 rate)?

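A 60% PCIE speed increase results in a 10% run time reduction.
As a rough first-order estimate, about 10/60=17% of the run time gets spent
in (non-overlapping) PCIE bus communication for this particular
configuration and for this particular simulated molecular system.
Strictly, if transfer time scales inversely with link speed, the transfers
take 5/8 of their original time (a 37.5% reduction), which puts the
estimate closer to 10/37.5=27%.
I'm referring to the "non-overlapping" part as this is the part that is not
hidden by (not overlapped with) calculations.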
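
The arithmetic can be rechecked with a few lines of shell. This is only a
sketch under an assumed simple model: the non-overlapped transfer time
scales inversely with the link rate and nothing else changes, so that
run time reduction = f * (1 - old_rate/new_rate), where f is the
non-overlapped share of the original run time. The function name
nonoverlap_fraction is made up for illustration; the inputs are the ns/day
figures and link rates quoted in this thread.

```shell
# nonoverlap_fraction OLD_NS_DAY NEW_NS_DAY OLD_RATE NEW_RATE
# Prints the estimated percentage of the original run time spent in
# non-overlapped PCIE transfers, under the assumed model
#   run time reduction = f * (1 - OLD_RATE / NEW_RATE)
nonoverlap_fraction() {
  awk -v p0="$1" -v p1="$2" -v r0="$3" -v r1="$4" 'BEGIN {
    reduction = 1 - p0 / p1   # fractional run time reduction (45 -> 50 ns/day is 0.10)
    shrink    = 1 - r0 / r1   # fractional reduction of the transfer time itself
    printf "%.1f\n", 100 * reduction / shrink
  }'
}

nonoverlap_fraction 45 50 5 8   # prints 26.7
```

Passing effective per-lane data rates instead of raw GT/s (PCIE 2.0 uses
8b/10b encoding and PCIE 3.0 uses 128b/130b, so roughly 4 and 7.877 Gb/s)
brings the estimate down to about 20%. Either way the result is only as
good as the assumed model.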

So changing the PCIE speed gives the Gromacs developers a
(non-user-friendly) knob to estimate the part of the run time that is
determined by the (non-overlapping) PCIE bus communication.

I'm not sure whether the Nvidia CUDA profiling environment provides a better
way to quantify this. If it doesn't, the method above is a poor man's
approach (for which you likely need root access) to obtain this
quantification.
--
Henk Neefs
Gromacs user


--
View this message in context: 
http://gromacs.5086.x6.nabble.com/Updating-GTX670-PCIE-speed-from-5GT-s-to-8GT-s-resulted-in-about-10-speedup-of-md-run-tp5012945p5013031.html
Sent from the GROMACS Users Forum mailing list archive at Nabble.com.


[gmx-users] Updating GTX670 PCIE speed from 5GT/s to 8GT/s resulted in about 10% speedup of md_run.

2013-11-30 Thread Henk Neefs
By configuring 8 GT/s PCIE 3.0 for the Nvidia driver, I got a 10% speedup
on Gromacs md_run.
   45 ns/day -> 50 ns/day (1AKI protein).

This posting is purely informational: these are my findings on how to do
this, so that others can exploit it too if they wish. I'm not asking a
question here.

Config: Intel Ivytown (Ivybridge family) processor (i7-4960X).
Nvidia GeForce GTX670. Single GPU card installed.
ASUS X79 Deluxe Motherboard. Single socket system.
Fedora 19
Nvidia driver: 331.20
Application: md_run
Gromacs 4.6.4
1AKI protein from tutorial
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin/gmx-tutorials/lysozyme/index.html

To measure the PCIe link speed, do either of:
  1. nvidia-settings (from the cmd-line; this is part of the Nvidia drivers
     package).
     Look under PowerMizer to see what PCIe Link Speed is presently used
     (it will change under load).
     Run md_run to generate a load. If 8 GT/s is not enabled then expect to
     see 5 GT/s during an md_run.

  2. lspci -vv | grep -i nvidia
     Use the device PCI address to get details (needs root privileges):
       lspci -vv -s 01:00.0
     You would see something like:
       LnkCap: Port #0, Speed 8GT/s, Width x16
         => PCIE Capability is set to 8 GT/s here (this is after applying
            the settings below).
       LnkSta: Speed 2.5GT/s, Width x16
         => PCIE Link State is presently at low speed as md_run is not
            running (to save power).
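
To avoid scanning the full lspci dump by eye, the two relevant lines can be
filtered out with awk. This is a minimal sketch, assuming the LnkCap/LnkSta
line layout shown above; link_speeds is a made-up helper name, and 01:00.0
is the device address from my system (yours may differ).

```shell
# link_speeds: reads `lspci -vv` output on stdin and prints one line per
# LnkCap/LnkSta entry in the form "<field> <speed> <width>".
link_speeds() {
  awk '/LnkCap:|LnkSta:/ {
    field = ($0 ~ /LnkCap:/) ? "LnkCap" : "LnkSta"
    speed = ""; width = ""
    # "Speed " and "Width " are 6 characters each; skip them in the match.
    if (match($0, /Speed [0-9.]+GT\/s/)) speed = substr($0, RSTART + 6, RLENGTH - 6)
    if (match($0, /Width x[0-9]+/))      width = substr($0, RSTART + 6, RLENGTH - 6)
    print field, speed, width
  }'
}

# Typical use (as root), once while idle and once while md_run is running:
#   lspci -vv -s 01:00.0 | link_speeds
```

Comparing the LnkSta speed under load against the LnkCap speed shows
whether the link actually reaches its advertised capability.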

If the capability shows 5 GT/s (or the speed is just 5 GT/s under md_run
load) then you can try the setting below to raise it to 8 GT/s.

I'm using the parameter NVreg_EnablePCIeGen3=1 as provided by the Nvidia
driver.

1. I first tried it for a single Fedora boot and ran md_run (a 30-minute
   wall-time example) to determine whether it's stable.
   During boot, when the boot options show up for the images:
     Press 'e' to edit the cmd line.
     Add to the cmd line:
       nvidia.NVreg_EnablePCIeGen3=1
     Then press 'Ctrl-X' to boot with that cmd line (note: this option will
     be forgotten on the next boot).
     Run md_run or some other tests that heavily exercise the graphics card
     to gauge the stability.

2. To make the change permanent (as root):
     Edit /etc/default/grub and add to the GRUB_CMDLINE_LINUX option:
       nvidia.NVreg_EnablePCIeGen3=1
     Then regenerate the grub configuration:
       grub2-mkconfig -o /boot/grub2/grub.cfg
   Now reboot and the setting will take effect from then onwards (for every
   boot).
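
After rebooting, one way to confirm that the option actually reached the
kernel is to look at the booted command line, which Linux exposes in
/proc/cmdline. The helper below is a small sketch (gen3_option_set is a
made-up name). It only checks that the option was passed; whether the link
really runs at 8 GT/s still has to be checked with nvidia-settings or lspci
under load.

```shell
# gen3_option_set: reads a kernel command line on stdin and succeeds
# (exit status 0) only if the option from this post is present as a
# separate word.
gen3_option_set() {
  grep -Eq '(^| )nvidia\.NVreg_EnablePCIeGen3=1( |$)'
}

# Typical use after a reboot:
#   gen3_option_set < /proc/cmdline && echo "Gen3 option is on the cmd line"
```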

FYI: Note that I'm using an Intel i7-4960X (Ivybridge family: 15MB, 6
Cores). It seems to support PCIE 3.0 with 8 GT/s (I'm not speaking for
Intel).

Disclaimer: Nvidia does not guarantee link/system stability when doing
this. I don't either. It works for me. Your mileage may vary. I only have a
single GPU card on the PCIE bus.

--
Henk Neefs
Gromacs User
Computer Architect (Ivytown chip architect at Intel).
Not speaking for Intel, these are all personal opinions.