Did you play with the time step? Just curious, but I wondered what happens
with 0.0008, 0.0005, or 0.0002. I found that even with a well-behaved protein,
as soon as I added a small (non-protein) molecule that rotated wildly while
attached to the protein, the run would crash once the constraints were removed
after equilibration (EQ), unless I reduced the time step to the values above.
It always seemed to me that it didn't like the rotation or the bond angles and
saw them as a violation, treating the molecule as if it were an amino acid
(the same bond type, but with wider rotation because one end wasn't fixed to a
chain). If your loop moves via the backbone, the calculated angles, bonds, or
whatever might appear to the program to be violating the parameter settings
that flag problems, errors, etc., because it can't track them fast enough over
one time step. I.e., atoms 1-2-3 and then the change in 1-2-3 in xyz
coordinates, but then that particular set has additional rotation and may
include chain atoms that bend wildly (N-Ca-Cb-Cg, maybe a dihedral), but
probably not this.
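If you want to try those smaller time steps without changing the simulated
length, the step counts have to be scaled up. Here is a rough sketch of the
arithmetic in Python; the helper names steps_for and rescale_interval are made
up for illustration, and the 1 ps total and the nstxtcout value of 1000 are
taken from the md.mdp posted below (500 steps at dt = 0.002).

```python
def steps_for(total_ps: float, dt_ps: float) -> int:
    """Number of MD steps needed to cover total_ps at time step dt_ps."""
    return round(total_ps / dt_ps)


def rescale_interval(nst: int, old_dt: float, new_dt: float) -> int:
    """Output interval in steps that keeps the same spacing in ps."""
    return round(nst * old_dt / new_dt)


if __name__ == "__main__":
    total_ps = 1.0  # 500 steps at dt = 0.002 ps, as in the posted md.mdp
    for dt in (0.002, 0.0008, 0.0005, 0.0002):
        nsteps = steps_for(total_ps, dt)
        nstxtcout = rescale_interval(1000, 0.002, dt)
        print(f"dt = {dt}  nsteps = {nsteps}  nstxtcout = {nstxtcout}")
```

The same scaling would apply to nstlog and nstenergy if the "every N ps"
comments in the .mdp file are meant to stay true.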
Just a thought, and probably not the right answer either: it might be the way
the calculation is broken down (as above) across the GPUs, which convert
everything to matrices (non-standard ones, just for basic math operations, not
real matrices per se) for execution, and then some library problem that does
not account for long-range, rapid (0.0005) movements in the chain (Ca, N, O to
something else) and then tries to apply these to Cb-Cg-O-H, etc., using the
initial points while looking at the parameters for, say, a single amino
acid... Maybe the constraints would cause this, which would make it a pain to
equilibrate. In my case, constraints allowed me to increase the time step, but
they would have ruined the experiment I was working on, as I needed the
molecule unconstrained to show it didn't float away when the proteins were
pulled, etc... I was using a different integrator though... just normal MD.
And your cutoffs for vdW, etc... Why are they 0? I don't know if this means a
default set is then used... but if not? Wouldn't it try integrating using
both types of formula, or would it use just Coulomb, or vice versa? (I don't
know what that would do to the code, but I assume it means no vdW and all
Coulomb; then again, zeros are always a problem for computers.)
Those are my thoughts on that. Probably something else though.
Good luck,
Stephan
Original Message
Date: Wed, 06 Jun 2012 18:42:45 -0400
From: Justin A. Lemkul jalem...@vt.edu
To: Discussion list for GROMACS users gmx-users@gromacs.org
Subject: [gmx-users] GPU crashes
Hi All,
I'm wondering if anyone has experienced what I'm seeing with Gromacs 4.5.5 on
GPU. It seems that certain systems fail inexplicably. The system I am working
with is a heterodimeric protein complex bound to DNA. After about 1 ns of
simulation time using mdrun-gpu, all the energies become NaN. The simulations
don't stop, they just carry on merrily producing nonsense. I would love to see
some action regarding http://redmine.gromacs.org/issues/941 for this reason ;)
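Since the run keeps going after the energies turn into NaN, one workaround is
to scan the energy output yourself and find the point where it goes bad. The
following is only a sketch: it assumes the energies have already been exported
to a plain-text .xvg file (for example with g_energy), where lines starting
with # or @ are comments and legend entries and the first column is time. The
function name first_nan_time is hypothetical.

```python
import math
import sys


def first_nan_time(xvg_lines):
    """Return the time of the first data row holding a NaN or inf, else None."""
    for line in xvg_lines:
        line = line.strip()
        # Skip blank lines, '#' comments, and '@' legend/format directives.
        if not line or line[0] in "#@":
            continue
        try:
            values = [float(field) for field in line.split()]
        except ValueError:
            continue  # not a numeric data row
        # Column 0 is time; every other column is an energy term.
        if any(not math.isfinite(v) for v in values[1:]):
            return values[0]
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as fh:
        t = first_nan_time(fh)
    print("no NaN found" if t is None else f"first NaN at t = {t} ps")
```

Run on each of the failing systems, this would at least give the time at which
things fall apart, which can then be compared with the trajectory.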
I ran simulations of each of the components of the system individually - each
protein alone, and DNA - to try to track down what might be causing this
problem. The DNA simulation is perfectly stable out to 10 ns, but each protein
fails within 2 ns. Each protein has two domains with a flexible linker, and it
seems that as soon as the linker flexes a bit, the simulations go poof.
Well-behaved proteins like lysozyme and DHFR (from the benchmark set) seem
fine, but anything that twitches even a small amount fails. This is very
unfortunate for us, as we are hoping to see domain motions on a feasible time
scale using implicit solvent on GPU hardware.
Has anyone seen anything like this? Our Gromacs implementation is being run on
an x86_64 Linux system with Tesla S2050 GPU cards. The CUDA version is 3.1 and
Gromacs is linked against OpenMM-2.0. An .mdp file is appended below. I have
also tested finite values for cutoffs, but the results were worse (failures
occurred more quickly).
I have not been able to use the latest git version of Gromacs to test whether
anything has been fixed, but will post separately to gmx-developers regarding
the reasons for that soon.
-Justin
=== md.mdp ===
title = Implicit solvent test
; Run parameters
integrator = sd
dt = 0.002
nsteps = 500 ; 1 ps (10 ns)
nstcomm = 1
comm_mode = angular ; non-periodic system
; Output parameters
nstxout = 0
nstvout = 0
nstfout = 0
nstxtcout = 1000 ; every 2 ps
nstlog = 5000 ; every 10 ps
nstenergy = 1000 ; every 2 ps
; Bond parameters
constraint_algorithm = lincs
constraints = all-bonds
continuation = no ; starting up
; required cutoffs for implicit
nstlist = 0
ns_type = grid
rlist = 0
rcoulomb = 0
rvdw = 0