Dear Dimitar,
I'm following the debate regarding:

The point was not "why" I was getting the restarts, but the fact that I was getting restarts close together in time, as I stated in my first post. I also don't actually know whether jobs are deleted or suspended. I had assumed that a job returned to the queue basically starts from the beginning when it is later moved to an empty slot, so I don't understand the difference from that perspective.

In the second mail you say:

Submitted by:
========================
ii=1
ifmpi="mpirun -np $NSLOTS"
--------
   if [ ! -f run${ii}-i.tpr ];then
      cp run${ii}.tpr run${ii}-i.tpr
      tpbconv -s run${ii}-i.tpr -until 200000 -o run${ii}.tpr
   fi

   k=`ls md-${ii}*.out | wc -l`
   outfile="md-${ii}-$k.out"
   if [[ -f run${ii}.cpt ]]; then
* $ifmpi `which mdrun` *-s run${ii}.tpr -cpi run${ii}.cpt -v -deffnm run${ii} -npme 0 > $outfile 2>&1

   fi
=========================


If I understand correctly, you are submitting the SERIAL mdrun. This means that multiple instances of mdrun are running at the same time, and each of them is an INDEPENDENT instance. Therefore checkpoint files, one for each instance (i.e. one for each CPU), are written at the same time.
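
One rough way to check which kind of mdrun `which mdrun` resolves to is to look at the libraries the binary is linked against. The snippet below is only a sketch under some assumptions: the helper name is_mpi_linked is made up for illustration, the binary is assumed to be dynamically linked, and the MPI library is assumed to have "libmpi" in its file name. A statically linked MPI build would not be detected this way.

```shell
# Hypothetical helper: succeeds if the given binary is dynamically
# linked against a library whose name contains "libmpi".
# Assumption: dynamic linking; a static MPI build will not be detected.
is_mpi_linked() {
    ldd "$1" 2>/dev/null | grep -qi 'libmpi'
}

# Resolve mdrun the same way the job script does (first match in PATH).
mdrun_bin=$(command -v mdrun || true)

if [ -n "$mdrun_bin" ] && is_mpi_linked "$mdrun_bin"; then
    echo "mdrun appears to be MPI-enabled: $mdrun_bin"
else
    echo "mdrun is missing or does not appear to be linked against MPI"
fi
```

If the second message is printed, then mpirun -np $NSLOTS would simply start $NSLOTS separate copies of the same serial program, which is the situation described above.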
-- 
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the 
www interface or send it to gmx-users-requ...@gromacs.org.
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
