On Thu, Jun 16, 2016 at 4:01 PM, Mark Abraham <mark.j.abra...@gmail.com> wrote:
> Hi,
>
> There's just nothing special about any node at run time.
>
> Your script looks like it is building GROMACS fresh each time - there's no
> need to do that,

Which part of my script? I always use this command to restart from a
checkpoint file:

    mpirun gmx_mpi mdrun -cpi [name].cpt -deffnm [name]

As far as I know, the -cpi option names the checkpoint file to read as
input. What do I have to change in my script?

> but the fact that the node name is showing up in the check
> that takes place when the checkpoint is read is not relevant to the
> problem.
>
> Mark
>
> On Thu, Jun 16, 2016 at 9:46 AM Husen R <hus...@gmail.com> wrote:
>
> > On Thu, Jun 16, 2016 at 2:32 PM, Mark Abraham <mark.j.abra...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > On Thu, Jun 16, 2016 at 9:30 AM Husen R <hus...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > Thank you for your reply!
> > > >
> > > > md_test.xtc exists and is writable.
> > > >
> > >
> > > OK, but it needs to be seen that way from the set of compute nodes you
> > > are using, and organizing that is up to you and your job scheduler, etc.
> > >
> > > > I tried to restart from the checkpoint file while excluding a node
> > > > other than compute-node, and it works.
> > > >
> > >
> > > Go do that, then :-)
> > >
> >
> > I'm building a simple system that can respond to node failure. If a
> > failure occurs on node A, then the application has to be restarted and
> > that node has to be excluded.
> > This should apply to all nodes, including this 'compute-node'.
> >
> > >
> > > > Only '--exclude=compute-node' produces this error.
> > > >
> > >
> > > Then there's something about that node that is special with respect to
> > > the file system - there's nothing about any particular node that GROMACS
> > > cares about.
> > >
> > > Mark
> > >
> > > > Is this the same issue as in this thread?
> > > > http://comments.gmane.org/gmane.science.biology.gromacs.user/40984
> > > >
> > > > regards,
> > > >
> > > > Husen
> > > >
> > > > On Thu, Jun 16, 2016 at 2:20 PM, Mark Abraham <mark.j.abra...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > The stuff about different nodes or numbers of nodes doesn't matter -
> > > > > it's merely an advisory note from mdrun. mdrun failed when it tried to
> > > > > operate upon md_test.xtc, so perhaps you need to consider whether the
> > > > > file exists, is writable, etc.
> > > > >
> > > > > Mark
> > > > >
> > > > > On Thu, Jun 16, 2016 at 6:48 AM Husen R <hus...@gmail.com> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I got the following error message when I tried to restart a GROMACS
> > > > > > simulation from a checkpoint file.
> > > > > > I restarted the simulation using fewer nodes and processes, and I
> > > > > > also excluded one node using the '--exclude=' option (in Slurm) for
> > > > > > experimental purposes.
> > > > > >
> > > > > > I'm sure that fewer nodes and processes are not the cause of this
> > > > > > error, as I have already tested that.
> > > > > > I have checked that the cause of this error is the '--exclude='
> > > > > > usage. I excluded one node named 'compute-node' when restarting from
> > > > > > the checkpoint (in the first run, I used all nodes, including
> > > > > > 'compute-node').
> > > > > >
> > > > > > It seems that in the first run, the submit job script was built on
> > > > > > compute-node. So, at restart, the build user mismatch appeared
> > > > > > because compute-node was not found (excluded).
> > > > > >
> > > > > > Am I right? Is this behavior normal?
> > > > > > Or is there a way to avoid this, so that I can freely restart from a
> > > > > > checkpoint using any nodes without limitation?
> > > > > >
> > > > > > thank you in advance
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Husen
> > > > > >
> > > > > > ==========================restart script=================
> > > > > > #!/bin/bash
> > > > > > #SBATCH -J ayo
> > > > > > #SBATCH -o md%j.out
> > > > > > #SBATCH -A necis
> > > > > > #SBATCH -N 2
> > > > > > #SBATCH -n 16
> > > > > > #SBATCH --exclude=compute-node
> > > > > > #SBATCH --time=144:00:00
> > > > > > #SBATCH --mail-user=hus...@gmail.com
> > > > > > #SBATCH --mail-type=begin
> > > > > > #SBATCH --mail-type=end
> > > > > >
> > > > > > mpirun gmx_mpi mdrun -cpi md_test.cpt -deffnm md_test
> > > > > > =====================================================
> > > > > >
> > > > > > ==================================output error========================
> > > > > > Reading checkpoint file md_test.cpt generated: Wed Jun 15 16:30:44 2016
> > > > > >
> > > > > > Build time mismatch,
> > > > > > current program: Sel Apr 5 13:37:32 WIB 2016
> > > > > > checkpoint file: Rab Apr 6 09:44:51 WIB 2016
> > > > > >
> > > > > > Build user mismatch,
> > > > > > current program: pro@head-node [CMAKE]
> > > > > > checkpoint file: pro@compute-node [CMAKE]
> > > > > >
> > > > > > #ranks mismatch,
> > > > > > current program: 16
> > > > > > checkpoint file: 24
> > > > > >
> > > > > > #PME-ranks mismatch,
> > > > > > current program: -1
> > > > > > checkpoint file: 6
> > > > > >
> > > > > > GROMACS patchlevel, binary or parallel settings differ from previous run.
> > > > > > Continuation is exact, but not guaranteed to be binary identical.
> > > > > >
> > > > > > -------------------------------------------------------
> > > > > > Program gmx mdrun, VERSION 5.1.2
> > > > > > Source code file:
> > > > > > /home/pro/gromacs-5.1.2/src/gromacs/gmxlib/checkpoint.cpp, line: 2216
> > > > > >
> > > > > > Fatal error:
> > > > > > Truncation of file md_test.xtc failed. Cannot do appending because of
> > > > > > this failure.
> > > > > > For more information and tips for troubleshooting, please check the
> > > > > > GROMACS website at http://www.gromacs.org/Documentation/Errors
> > > > > > -------------------------------------------------------
> > > > > > ================================================================
> > > > > > --
> > > > > > Gromacs Users mailing list
> > > > > >
> > > > > > * Please search the archive at
> > > > > > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > > > > > posting!
> > > > > >
> > > > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > > > > >
> > > > > > * For (un)subscribe requests visit
> > > > > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > > > > > send a mail to gmx-users-requ...@gromacs.org.
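
Regarding the suggestion quoted above to check whether md_test.xtc exists and is writable as seen from the compute nodes, here is a minimal sketch of such a pre-flight check. The function name check_writable is hypothetical (it is not part of GROMACS or Slurm), and which files mdrun actually appends to depends on the run settings, so the file list passed in is an assumption. Passing this check only shows that the file is visible and writable from the node where it runs; it does not guarantee that the truncation done by mdrun will succeed.

```shell
#!/bin/bash
# Hypothetical pre-flight check; not part of GROMACS or Slurm.
# Usage: check_writable FILE...
# Returns 0 if every FILE exists and is writable from this node, 1 otherwise.
check_writable() {
    local status=0 f
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            # File is not visible from this node at all.
            echo "${HOSTNAME:-unknown}: $f does not exist" >&2
            status=1
        elif [ ! -w "$f" ]; then
            # File is visible, but this user cannot write to it.
            echo "${HOSTNAME:-unknown}: $f is not writable" >&2
            status=1
        fi
    done
    return "$status"
}

# Example (assumed file name taken from the error message above):
# check_writable md_test.xtc && \
#     mpirun gmx_mpi mdrun -cpi md_test.cpt -deffnm md_test
```

Since the check has to run on the nodes that the job actually gets, one option is to save it as a script and launch it once per allocated node from inside the job script, for example with srun --ntasks-per-node=1, before calling mdrun.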