Re: [gmx-users] Free-energy on GMX-2019.1 ( lower performance on GPU) (Mark Abraham)

2019-03-15 Thread Mark Abraham
Hi,

In that case you have 122*87=10614 perturbed atoms in a 91K atom system.
The FEP code in GROMACS is not engineered to run fast anywhere near that
regime. If possible, I'd suggest you explore what you can learn with e.g.
just one drug molecule in a similar system.
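
For such a test, a minimal free-energy block could look like the sketch
below. This is only an illustration: it assumes the topology defines a
separate moleculetype (here called DRG) holding the single drug copy to be
perturbed, and the lambda-related values are placeholders.

; illustrative settings for (de)coupling a single drug molecule
free-energy       = yes
couple-moltype    = DRG    ; assumed moleculetype name for the drug
couple-lambda0    = none   ; interactions with the rest of the system off at state A
couple-lambda1    = vdw-q  ; vdW and charges fully on at state B
couple-intramol   = no     ; do not decouple intramolecular interactions
init-lambda-state = 0      ; set to the window being run
sc-alpha          = 0.5    ; soft-core for the vdW part of the transformation
sc-power          = 1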

Mark

On Fri, 15 Mar 2019 at 12:27 praveen kumar  wrote:

> [...]

Re: [gmx-users] Free-energy on GMX-2019.1 ( lower performance on GPU) (Mark Abraham)

2019-03-15 Thread praveen kumar
Dear Mark,

I have a system containing a formed lipid bilayer (phospholipids + drug
molecules, ~91 K atoms): there are 120 phospholipids and 87 drug molecules
in a box of 8 x 8 x 12. I am trying to grow all 87 drug molecules (each
drug consists of 122 atoms) from the decoupled state to the coupled state
using a two-stage TI protocol, handling the vdW interactions first and then
the electrostatics (sketched below). I have tested both stages, and neither
runs on the GPU; the work runs mostly on the CPU. I have also tried
-pme gpu -bonded gpu, but these options do not get the runs onto the GPU.
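
A sketch of this kind of two-stage schedule (illustrative placeholder
values, not the exact ones from my files) would be:

; grow the drug molecules in: vdW first (states 0-4), then charges (states 4-8)
free-energy       = yes
init-lambda-state = 0      ; set to the window being run
vdw-lambdas       = 0.00 0.25 0.50 0.75 1.00 1.00 1.00 1.00 1.00
coul-lambdas      = 0.00 0.00 0.00 0.00 0.00 0.25 0.50 0.75 1.00
sc-alpha          = 0.5    ; soft-core on the vdW transformation
sc-power          = 1
nstdhdl           = 100    ; write dH/dlambda for TI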

Thanks
Praveen



Re: [gmx-users] Free-energy on GMX-2019.1 ( lower performance on GPU)

2019-03-15 Thread Mark Abraham
Hi,

How large are your perturbed region and your normal region? The FEP
short-ranged kernels run on the CPU and are not well optimized for
performance, so the larger the perturbed region, the worse things get.
Because FEP adds a lot of extra CPU work, you may see improvements from
also adding -pme gpu -bonded gpu to your mdrun invocation (see the example
below), which moves that kind of work off the CPU.
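
A sketch of such an invocation (file names, thread counts and GPU id are
placeholders, not a recommendation for this particular machine):

gmx mdrun -deffnm npt_fep -nb gpu -pme gpu -bonded gpu -ntmpi 1 -ntomp 8 -gpu_id 0

Whether PME actually stays on the GPU depends on what mdrun supports for
the given settings; mdrun reports the task assignment it chose in the log
file.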

BTW lincs-order=12 is needlessly large, but it is not the problem here.

Mark

On Fri, 15 Mar 2019 at 06:16 praveen kumar  wrote:

> [...]

[gmx-users] Free-energy on GMX-2019.1 ( lower performance on GPU)

2019-03-14 Thread praveen kumar
Dear All,

I am trying to run a free-energy simulation using the TI method in GROMACS
2019.1 on a GPU machine (containing two NVIDIA GeForce 1080 Ti cards), but
unfortunately I am unable to get the free-energy simulation to run on the
GPU.

The normal MD simulation (without free energy) runs perfectly on the GPU
and gives us an excellent speed-up: for example, a 100 K atom system gives
us ~80 ns per day on a GPU card (at >80 % GPU usage). When I run the
free-energy simulation for the same system, the performance falls
drastically to ~0.02 ns per day (at 0 % GPU usage).

I am pasting the MDP files for the normal MD simulation and the
free-energy simulation below.
npt.mdp (MD simulation)


#
title= MD simulation
; Run parameters
integrator= md; leap-frog integrator
nsteps= 1  ; 2 * 6000   = 200 ns
dt= 0.002; 2 fs
; Output control
nstxout= 10  ; save coordinates every 10.0 ps
nstvout= 10  ; save velocities every 10.0 ps
nstfout= 10  ; save forces every 10.0 ps
nstenergy= 500; save energies every 10.0 ps
nstlog= 5000; update log file every 10.0 ps
nstxout-compressed  = 5000  ; save compressed coordinates every 10.0 ps, nstxout-compressed replaces nstxtcout
compressed-x-grps   = System; replaces xtc-grps
; Bond parameters
continuation= yes; Restarting after NVT
constraint_algorithm= lincs; holonomic constraints
constraints= h-bonds; H bonds constrained
lincs_iter= 1; accuracy of LINCS
lincs_order= 4; also related to accuracy
; Neighborsearching
cutoff-scheme   = Verlet
ns_type= grid; search neighboring grid cells
nstlist= 10; 20 fs, largely irrelevant with Verlet
rcoulomb= 1.2; short-range electrostatic cutoff (in nm)
rvdw= 1.2; short-range van der Waals cutoff (in nm)
rvdw-switch = 1.0
vdwtype = cutoff
vdw-modifier= force-switch
rlist = 1.2
; Electrostatics
coulombtype= PME; Particle Mesh Ewald for long-range electrostatics
pme_order= 4; cubic interpolation
fourierspacing= 0.16; grid spacing for FFT
; Temperature coupling is on
tcoupl= V-rescale; modified Berendsen thermostat
tc-grps= system; Water   ; two coupling groups - more accurate
tau_t= 0.1 ;0.1  ; time constant, in ps
ref_t= 360  ;340 ; reference temperature, one for each group, in K
; Pressure coupling is on
;pcoupl  =no
pcoupl= Parrinello-Rahman; Pressure coupling on in NPT
pcoupltype= isotropic; uniform scaling of box vectors
tau_p= 2.0; time constant, in ps
ref_p= 1.0   ;1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; 4.5e-5; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc= xyz; 3-D PBC
; Dispersion correction
DispCorr= no; account for cut-off vdW scheme
; Velocity generation
gen_vel= no; Velocity generation is off
##
npt.mdp (free-energy simulation)
##

; Run control
integrator   = sd   ; Langevin dynamics
tinit= 0
dt   = 0.002
nsteps   = 5; 100 ps
nstcomm  = 100
; Output control
nstxout  = 500
nstvout  = 500
nstfout  = 0
nstlog   = 500
nstenergy= 500
nstxout-compressed   = 0
; Neighborsearching and short-range nonbonded interactions
cutoff-scheme= verlet
nstlist  = 20
ns_type  = grid
pbc  = xyz
rlist= 1.2
; Electrostatics
coulombtype  = PME
rcoulomb = 1.2
; van der Waals
vdwtype  = cutoff
vdw-modifier = potential-switch
rvdw-switch  = 1.0
rvdw = 1.2
; Apply long range dispersion corrections for Energy and Pressure
DispCorr  = EnerPres
; Spacing for the PME/PPPM FFT grid
fourierspacing   = 0.12
; EWALD/PME/PPPM parameters
pme_order= 6
ewald_rtol   = 1e-06
epsilon_surface  = 0
; Temperature coupling
; tcoupl is implicitly handled by the sd