Re: [gmx-users] Restarting crashed simulation
Hi,

It looks like you made a typo with state.cpt, and perhaps you have multiple
mdrun processes running, such that the actual output is in one of the backup
files labelled with # characters.

Mark

On Fri, 17 Nov 2017 19:37 Ali Ahmed wrote:
> Hello GROMACS users
> My MD simulation crashed, so I restarted it from the last written
> checkpoint using this command on 64 processors:
>   mpirun -np 64 mdrun_mpi -s md.tpr -cpi stat.cpt
>
> Warning: No checkpoint file found with -cpi option. Assuming this is a new
> run.
>
> [remainder of the quoted log snipped; see the original message below]

--
Gromacs Users mailing list

* Please search the archive at
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
send a mail to gmx-users-requ...@gromacs.org.
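Mark's two suggestions can be checked directly from the shell. GROMACS writes
checkpoints under the default name state.cpt, which is why a mistyped
-cpi stat.cpt silently starts a new run, and mdrun renames files it would
otherwise overwrite to backups like ./#md.log.2#. A minimal sketch (the restart
line is left as a comment; file names assume GROMACS defaults):

```shell
# GROMACS writes checkpoints to state.cpt (and the previous one to
# state_prev.cpt) by default; a mistyped -cpi name makes mdrun assume a
# fresh run. Check what is actually on disk:
cpt_found=$(ls state.cpt state_prev.cpt 2>/dev/null)

# mdrun backs up files it would overwrite to names like ./#md.log.2#
backups=$(ls \#*\# 2>/dev/null)

echo "checkpoints found: $cpt_found"
echo "backup files: $backups"

# Once the real checkpoint name is known, restart with it, e.g.:
#   mpirun -np 64 mdrun_mpi -s md.tpr -cpi state.cpt
```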
[gmx-users] Restarting crashed simulation
Hello GROMACS users,

My MD simulation crashed, so I restarted it from the last written checkpoint
using this command on 64 processors:

  mpirun -np 64 mdrun_mpi -s md.tpr -cpi stat.cpt

After a few days I got no output files in the folder, such as output.gro, and
I got the following:

___
Command line:
  mdrun_mpi -s md.tpr -cpi stat.cpt

Warning: No checkpoint file found with -cpi option. Assuming this is a new run.

Back Off! I just backed up md.log to ./#md.log.2#

Running on 4 nodes with total 64 cores, 64 logical cores
  Cores per node: 16
  Logical cores per node: 16
Hardware detected on host compute-2-27.local (the node of MPI rank 0):
  CPU info:
    Vendor: Intel
    Brand: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
    SIMD instructions most likely to fit this hardware: AVX_256
    SIMD instructions selected at GROMACS compile time: AVX_256
  Hardware topology: Basic

Reading file md.tpr, VERSION 2016.3 (single precision)
Changing nstlist from 10 to 40, rlist from 1 to 1.003

Will use 48 particle-particle and 16 PME only ranks
This is a guess, check the performance at the end of the log file
Using 64 MPI processes
Using 1 OpenMP thread per MPI process

Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity

WARNING: This run will generate roughly 50657 Mb of data

starting mdrun 'Molecular Dynamics'
2500 steps, 5.0 ps.

step 888000 Turning on dynamic load balancing, because the performance loss due to load imbalance is 8.7 %.
step 930400 Turning off dynamic load balancing, because it is degrading performance.
step 1328000 Turning on dynamic load balancing, because the performance loss due to load imbalance is 3.4 %.
step 1328800 Turning off dynamic load balancing, because it is degrading performance.
step 1336000 Turning on dynamic load balancing, because the performance loss due to load imbalance is 3.4 %.
step 1338400 Turning off dynamic load balancing, because it is degrading performance.
step 134 Will no longer try dynamic load balancing, as it degraded performance.

Writing final coordinates.

Average load imbalance: 13.2 %
Part of the total run time spent waiting due to load imbalance: 7.5 %
Average PME mesh/force load: 1.077
Part of the total run time spent waiting due to PP/PME imbalance: 4.1 %

NOTE: 7.5 % of the available CPU time was lost due to load imbalance
      in the domain decomposition.
      You might want to use dynamic load balancing (option -dlb.)

                      Core t (s)   Wall t (s)        (%)
       Time:        26331875.601   411435.556     6400.0
                                   4d18h17:15
                        (ns/day)    (hour/ns)
Performance:              10.500        2.286
___

Any advice or suggestion will be helpful.

Thanks in advance
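The "No checkpoint file found" warning in the log above is the key line: the
name passed to -cpi did not exist, so mdrun started a brand-new run instead of
continuing. A quick way to see which checkpoint names are actually present,
and (as a commented follow-up, since it needs gmx on PATH) how to inspect what
step a checkpoint holds:

```shell
# Check which of the two spellings from the thread exists on disk.
status=""
for f in stat.cpt state.cpt; do
    if [ -e "$f" ]; then
        status="$status found:$f"
    else
        status="$status missing:$f"
    fi
done
echo "checkpoint status:$status"

# To see the step and simulation time stored in a checkpoint, dump its
# header with gmx dump (requires a GROMACS installation):
#   gmx dump -cp state.cpt | head -n 20
```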
Re: [gmx-users] Restarting crashed simulation
Hi,

My simulation stopped because of a power outage and I restarted it using the
command:

  mpirun -np 32 mdrun_mpi -cpi mdrun.cpt -s mdrun.cpt

It correctly shows the restarting time point; however, mdrun.log is not
getting updated, and I don't know how to verify whether the simulation is
continuing or not.

--
*Best Regards*
Bharat
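A lagging log does not necessarily mean the run is dead: mdrun writes to the
log only periodically (at the nstlog interval), so the modification times of
the trajectory and checkpoint files are often a better liveness signal. A
minimal check (a sketch; the mdrun.log name comes from the command above, and
the GNU coreutils form of stat is assumed):

```shell
# Poll the log's modification time twice; if it advances, mdrun is still
# writing to it. stat -c %Y prints the mtime as a Unix timestamp (GNU form).
log=mdrun.log
t1=$(stat -c %Y "$log" 2>/dev/null || echo 0)
sleep 2
t2=$(stat -c %Y "$log" 2>/dev/null || echo 0)
if [ "$t2" -gt "$t1" ]; then
    echo "$log was updated in the last 2 s"
else
    echo "$log unchanged; check the .xtc/.trr and .cpt mtimes too"
fi
```

Note also that mdrun's -s option normally takes the .tpr run-input file, so
"-s mdrun.cpt" in the command above may itself be a typo worth checking.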