Dear Dimitar,
I'm following the debate regarding:
The point was not "why" I was getting the restarts, but the fact
itself that I was getting restarts close in time, as I stated in my
first post. I actually also don't know whether jobs are deleted or
suspended. I've thought that a job returned back to the queue will
basically start from the beginning when later moved to an empty slot
... so don't understand the difference from that perspective.
In the second mail you say:
Submitted by:
========================
ii=1
ifmpi="mpirun -np $NSLOTS"
--------
if [ ! -f run${ii}-i.tpr ];then
cp run${ii}.tpr run${ii}-i.tpr
tpbconv -s run${ii}-i.tpr -until 200000 -o run${ii}.tpr
fi
k=`ls md-${ii}*.out | wc -l`
outfile="md-${ii}-$k.out"
if [[ -f run${ii}.cpt ]]; then
* $ifmpi `which mdrun` *-s run${ii}.tpr -cpi run${ii}.cpt -v -deffnm
run${ii} -npme 0 > $outfile 2>&1
fi
=========================
If I understand correctly, you are submitting the SERIAL mdrun under
mpirun. This means that multiple instances of mdrun are running at the
same time. Each instance of mdrun is an INDEPENDENT instance. Therefore
checkpoint files, one for each instance (i.e. one for each CPU), are
written at the same time.
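
One rough way to check which kind of mdrun you are launching is to look
at its shared-library dependencies. The sketch below is only a heuristic
under stated assumptions: it assumes that MPI is linked dynamically and
that the library name contains "libmpi", which is the case for common
Open MPI and MPICH builds but is not guaranteed for every installation.
The function name is_mpi_binary is my own (hypothetical) and is not part
of GROMACS.

```shell
#!/bin/bash
# Sketch: guess whether a binary is MPI-enabled from its shared libraries.
# Assumption: MPI is linked dynamically and the library name contains
# "libmpi". A statically linked MPI build would be misreported as serial.
is_mpi_binary() {
    local bin="$1"
    [ -x "$bin" ] || return 2      # not an executable file
    ldd "$bin" 2>/dev/null | grep -qi 'libmpi'
}

# Usage: this part only does something if mdrun is found on the PATH.
if mdrun_path=$(command -v mdrun); then
    if is_mpi_binary "$mdrun_path"; then
        echo "$mdrun_path looks MPI-enabled"
    else
        echo "$mdrun_path looks serial: mpirun would start independent copies"
    fi
fi
```

If the binary turns out to be serial, "mpirun -np $NSLOTS" simply starts
$NSLOTS separate copies of it, which would be consistent with the
behaviour described above.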