On 10/7/12 2:15 PM, Ladasky wrote:
Justin Lemkul wrote:
Random segmentation faults are really hard to debug.  Can you resume the run
using a checkpoint file?  That would suggest maybe an MPI problem or something
else external to Gromacs.  Without a reproducible system and a debugging
backtrace, it's going to be hard to figure out where the problem is coming
from.

Thanks for that tip, Justin.  I tried to resume one run which failed at 1.06
million cycles, and it WORKED.  It proceeded all the way to the 2.50 million
cycles that I designated.  I now have two separate .trr files, but I suppose
they can be merged.

I don't know whether my crashes are random yet.  I will try re-running that
simulation again from time zero, to see whether it segfaults at the same
place.  If it doesn't, then I have a problem which may have nothing to do
with GROMACS.

I looked in on memory usage several times while mdrun_mpi was executing.
Overall, about 3 GB of my computer's 8 GB of RAM were in use.  As I
expected, GROMACS used very little of this.  The mpirun process used a
constant 708K.  I had five mdrun_mpi processes, all of which used slightly
more RAM as they worked, but I didn't notice anything which suggested a
gross memory leak.  The process which used the most RAM was using 14.4 MB
right after it started, rose to 15.9 MB within the first ten minutes or so,
and reached 16.0 MB after four hours.  The process which used the least RAM
started at 10.6 MB and finished at 10.8 MB.  Altogether, GROMACS was using
about 64 MB.
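
The periodic memory checks described above could be scripted rather than done by hand.  What follows is only a minimal sketch, assuming a Linux system that exposes /proc/<pid>/comm and /proc/<pid>/status.  The function names (parse_vmrss_kb, find_pids, sample, monitor) are made up for illustration and are not part of GROMACS or of any MPI distribution.

```python
import re
import time
from pathlib import Path


def parse_vmrss_kb(status_text):
    """Return the VmRSS value (in kB) from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def find_pids(name):
    """Return PIDs whose command name equals `name` (assumes Linux /proc)."""
    pids = []
    for comm in Path("/proc").glob("[0-9]*/comm"):
        try:
            if comm.read_text().strip() == name:
                pids.append(int(comm.parent.name))
        except OSError:
            continue  # the process exited while we were scanning
    return pids


def sample(name="mdrun_mpi"):
    """Return {pid: resident memory in MB} for every matching process."""
    usage = {}
    for pid in find_pids(name):
        try:
            text = Path(f"/proc/{pid}/status").read_text()
        except OSError:
            continue  # the process exited between the two reads
        rss_kb = parse_vmrss_kb(text)
        if rss_kb is not None:
            usage[pid] = rss_kb / 1024.0
    return usage


def monitor(name="mdrun_mpi", interval_s=600, samples=6):
    """Print a timestamped memory sample every `interval_s` seconds."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), sample(name))
        time.sleep(interval_s)
```

Calling monitor() from a second terminal while the job runs would print one line every ten minutes.  Values that keep climbing across many samples would hint at a leak; the figures reported above level off instead.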

I have a well-cooled CPU; core temperatures stay under 50 degrees when the
system is running under full load.  My system doesn't lock up or crash on
me.  I think that my hardware is good.



My first guess would be a buggy MPI implementation. I can't comment on hardware specs, but the random failures seen in mdrun_mpi are usually the result of some generic MPI failure. Which MPI implementation are you using?

-Justin

--
========================================

Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================
--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users