Re: [gmx-users] continue replica exchange MD
On Wed, Mar 21, 2012 at 11:36 PM, Kukol, Andreas a.ku...@herts.ac.uk wrote:
> Hello,
>
> Upon continuing a replica exchange MD simulation using the command
>
> mdrun -cpi state.cpt -append -s tpr_remd20ns_.tpr -multi 48 -replex 1 -cpt 60 -x xtcRemd_20ns.xtc -c afterRemd_20ns.gro -g logRemd_20ns.log -v -e edrRemd_20ns.edr -stepout 2000

From my side, I have no problem resuming or extending the REMD simulations in V.4.5.5 and 4.5.4.

Here is the command:

mdrun_g_f -s md_.tpr -multi 32 -replex 500 -cpi state_.cpt -append

I use state_.cpt, not state.cpt.

> I get the following output:
> [...]

--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the
www interface or send it to gmx-users-requ...@gromacs.org.
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
Re: [gmx-users] continue replica exchange MD
On 22 Mar 2012, at 08:05, lina wrote:
> On Wed, Mar 21, 2012 at 11:36 PM, Kukol, Andreas a.ku...@herts.ac.uk wrote:
>> Hello,
>>
>> Upon continuing a replica exchange MD simulation using the command
>>
>> mdrun -cpi state.cpt -append -s tpr_remd20ns_.tpr -multi 48 -replex 1 -cpt 60 -x xtcRemd_20ns.xtc -c afterRemd_20ns.gro -g logRemd_20ns.log -v -e edrRemd_20ns.edr -stepout 2000
>
> From my side, I have no problem resuming or extending the REMD simulations in V.4.5.5 and 4.5.4

I've had problems with continuation of REMD simulations with gmx 4.5.5 (although they manifested differently, IIRC). The release-4-5-patches branch contains bugfixes that solved the problems. I suggest trying the patches.

Erik

> Here is the command:
>
> mdrun_g_f -s md_.tpr -multi 32 -replex 500 -cpi state_.cpt -append
>
> I use state_.cpt, not state.cpt
>
>> I get the following output:
>> [...]
[gmx-users] continue replica exchange MD
Hello,

Upon continuing a replica exchange MD simulation using the command

mdrun -cpi state.cpt -append -s tpr_remd20ns_.tpr -multi 48 -replex 1 -cpt 60 -x xtcRemd_20ns.xtc -c afterRemd_20ns.gro -g logRemd_20ns.log -v -e edrRemd_20ns.edr -stepout 2000

I get the following output:

**
... ...
500 steps, 1.0 ps (continuing from step 49430, 98.9 ps).
500 steps, 1.0 ps (continuing from step 49430, 98.9 ps).
step 49430, will finish Wed Sep 12 16:09:33 2012
step 5, will finish Thu May 24 11:23:04 2012
Step 47546: resetting all time and cycle counters
=
PBS: job killed: walltime 604823 exceeded limit 604800
Terminated
**

Apparently, the job runs for one week on a computer cluster (that is the maximum time allowed), but it does not progress very much beyond step 49430. Also the log-file does not show any more steps:

           Step           Time         Lambda
          46455       92.91000            0.0

   Grid: 18 x 17 x 25 cells
   Energies (kJ/mol)
       G96Angle    Proper Dih. Ryckaert-Bell.  Improper Dih.          LJ-14
    5.83095e+04    3.70277e+04    2.14102e+03    8.83853e+03   -7.33070e+02
     Coulomb-14        LJ (SR)        LJ (LR)  Disper. corr.   Coulomb (SR)
    2.29503e+05    3.04138e+05   -2.66781e+04   -8.51221e+03   -2.74692e+06
   Coul. recip. Position Rest.      Potential    Kinetic En.   Total Energy
   -9.59421e+05    5.41532e+03   -3.09689e+06    5.18959e+05   -2.57793e+06
    Temperature Pres. DC (bar) Pressure (bar)   Constr. rmsd
    2.97550e+02   -1.14933e+02    5.41944e+01        0.0e+00

Writing checkpoint, step 49430 at Fri Jan 27 09:43:23 2012
---
Restarting from checkpoint, appending to previous log file.
... ...
Started mdrun on node 0 Tue Mar 6 16:40:10 2012

           Step           Time         Lambda
          49430       98.86000            0.0

   Grid: 18 x 17 x 25 cells
   Energies (kJ/mol)
       G96Angle    Proper Dih. Ryckaert-Bell.  Improper Dih.          LJ-14
    5.84241e+04    3.69121e+04    2.09533e+03    8.80916e+03   -4.67086e+02
     Coulomb-14        LJ (SR)        LJ (LR)  Disper. corr.   Coulomb (SR)
    2.29528e+05    2.99825e+05   -2.67028e+04   -8.51334e+03   -2.74410e+06
   Coul. recip. Position Rest.      Potential    Kinetic En.   Total Energy
   -9.59506e+05    5.47116e+03   -3.09823e+06    5.18993e+05   -2.57923e+06
    Temperature Pres. DC (bar) Pressure (bar)   Constr. rmsd
    2.97570e+02   -1.14963e+02    2.67842e+00        0.0e+00

           Step           Time         Lambda
              5          100.0            0.0

   Energies (kJ/mol)
       G96Angle    Proper Dih. Ryckaert-Bell.  Improper Dih.          LJ-14
    5.86161e+04    3.71585e+04    2.15336e+03    8.92946e+03   -4.84684e+02
     Coulomb-14        LJ (SR)        LJ (LR)  Disper. corr.   Coulomb (SR)
    2.29950e+05    3.01014e+05   -2.66724e+04   -8.51306e+03   -2.74349e+06
   Coul. recip. Position Rest.      Potential    Kinetic En.   Total Energy
   -9.59537e+05    5.56712e+03   -3.09531e+06    5.19371e+05   -2.57594e+06
    Temperature Pres. DC (bar) Pressure (bar)   Constr. rmsd
    2.97787e+02   -1.14956e+02    2.36460e+01    1.50068e-05

[End of log-file]
***

I wonder, if this is my mistake (using the mdrun wrongly), a Gromacs problem or maybe a problem of the computer cluster (MPI, etc).

I would be grateful for any help.

Many thanks
Andreas
Re: [gmx-users] continue replica exchange MD
On 22/03/2012 2:36 AM, Kukol, Andreas wrote:
> Hello,
>
> Upon continuing a replica exchange MD simulation using the command
>
> mdrun -cpi state.cpt -append -s tpr_remd20ns_.tpr -multi 48 -replex 1 -cpt 60 -x xtcRemd_20ns.xtc -c afterRemd_20ns.gro -g logRemd_20ns.log -v -e edrRemd_20ns.edr -stepout 2000
>
> I get the following output:
>
> **
> ... ...
> 500 steps, 1.0 ps (continuing from step 49430, 98.9 ps).
> 500 steps, 1.0 ps (continuing from step 49430, 98.9 ps).
> step 49430, will finish Wed Sep 12 16:09:33 2012
> step 5, will finish Thu May 24 11:23:04 2012
> Step 47546: resetting all time and cycle counters
> =
> PBS: job killed: walltime 604823 exceeded limit 604800
> Terminated
> **
>
> Apparently, the job runs for one week on a computer cluster (that is the maximum time allowed), but it does not progress very much beyond step 49430. Also the log-file does not show any more steps:
>
> [...]
>
> I wonder, if this is my mistake (using the mdrun wrongly), a Gromacs problem or maybe a problem of the computer cluster (MPI, etc).
>
> I would be grateful for any help.

It's almost certainly hanging at the next replica-exchange attempt.

GROMACS version? Same between original run and restart?

Do the times in the .cpt files match - check with gmxdump? If not, you need a matching set of .cpt files by using some of the state_prev.cpt files. Back everything up and copy and rename .cpt files to make it work.

Mark
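
The consistency check Mark describes (do all replicas' checkpoints sit at the same point?) can be sketched in Python. This is only an illustrative sketch and not part of the original thread: it assumes that `gmxdump -cp <file>` writes the checkpoint header to standard output and that the header contains a line of the form `step = N`. The helper names and any file names such as `state0.cpt` are hypothetical.

```python
import re
import subprocess
from collections import Counter

# Assumption: the dumped checkpoint header contains a line like "step = 49430".
STEP_LINE = re.compile(r"^\s*step\s*=\s*(\d+)", re.MULTILINE)


def dump_checkpoint(path):
    """Return the text printed by `gmxdump -cp <path>` (GROMACS 4.5-era tool name)."""
    result = subprocess.run(
        ["gmxdump", "-cp", path], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_step(dump_text):
    """Extract the MD step from dumped checkpoint text, or None if no step line is found."""
    match = STEP_LINE.search(dump_text)
    return int(match.group(1)) if match else None


def mismatched(steps_by_file):
    """Return the files whose step differs from the most common step across replicas."""
    if not steps_by_file:
        return {}
    common_step, _ = Counter(steps_by_file.values()).most_common(1)[0]
    return {name: step for name, step in steps_by_file.items() if step != common_step}
```

With `files` holding one checkpoint path per replica, `mismatched({f: parse_step(dump_checkpoint(f)) for f in files})` would list the checkpoints that are out of step with the rest. Those are the candidates for replacement by a backed-up `_prev` copy, as suggested above. Comparing against the most common step is a design choice for this sketch. If most replicas were written at the later step, it may instead be the majority that has to be rolled back to the `_prev` files.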