Re: [gmx-users] continue replica exchange MD

2012-03-22 Thread lina
On Wed, Mar 21, 2012 at 11:36 PM, Kukol, Andreas a.ku...@herts.ac.uk wrote:
 Hello,

 Upon continuing a replica exchange MD simulation using the command

 mdrun -cpi state.cpt -append -s tpr_remd20ns_.tpr -multi 48 -replex 1 
 -cpt 60 -x xtcRemd_20ns.xtc -c afterRemd_20ns.gro -g logRemd_20ns.log -v -e 
 edrRemd_20ns.edr -stepout 2000

From my side, I have had no problems resuming or extending REMD
simulations in versions 4.5.5 and 4.5.4.

Here is the command:

mdrun_g_f -s md_.tpr -multi 32 -replex 500  -cpi state_.cpt -append

    I use state_.cpt, not state.cpt
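
To make the file-name remark concrete: as far as I understand the 4.5 series, mdrun -multi takes one base name per file type and inserts the replica index just before the extension, so state_.cpt stands for state_0.cpt through state_31.cpt. The Python sketch below only illustrates that assumption; replica_filenames and missing_files are hypothetical helpers, not part of GROMACS. It lists the per-replica files a restart would look for and reports any that are absent.

```python
from pathlib import Path


def replica_filenames(base, n_replicas):
    """Expand a base name such as 'state_.cpt' into per-replica names.

    Assumption: mdrun -multi inserts the replica index just before the
    file extension (state_.cpt -> state_0.cpt, state_1.cpt, ...).
    """
    p = Path(base)
    return [f"{p.stem}{i}{p.suffix}" for i in range(n_replicas)]


def missing_files(base, n_replicas, directory="."):
    """Return the expected per-replica files that are not in `directory`."""
    d = Path(directory)
    return [name for name in replica_filenames(base, n_replicas)
            if not (d / name).is_file()]


if __name__ == "__main__":
    # 32 replicas, as in the command above.
    for base in ("md_.tpr", "state_.cpt"):
        print(base, "missing:", missing_files(base, 32))
```

Running it in the simulation directory before resubmitting shows whether every replica has both a run input file and a checkpoint under the names given on the command line.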


 I get the following output:
 **
 ...
 ...
 500 steps,  1.0 ps (continuing from step 49430,     98.9 ps).
 500 steps,  1.0 ps (continuing from step 49430,     98.9 ps).

 step 49430, will finish Wed Sep 12 16:09:33 2012
 step 5, will finish Thu May 24 11:23:04 2012
 Step 47546: resetting all time and cycle counters

 = PBS: job killed: walltime 604823 exceeded limit 604800
 Terminated
 **

 Apparently, the job runs for one week on a computer cluster (that is the 
 maximum time allowed), but it does not progress very much beyond step 49430.

 Also the log-file does not show any more steps:
 
           Step           Time         Lambda
          46455       92.91000        0.0

 Grid: 18 x 17 x 25 cells
   Energies (kJ/mol)
       G96Angle    Proper Dih. Ryckaert-Bell.  Improper Dih.          LJ-14
    5.83095e+04    3.70277e+04    2.14102e+03    8.83853e+03   -7.33070e+02
     Coulomb-14        LJ (SR)        LJ (LR)  Disper. corr.   Coulomb (SR)
    2.29503e+05    3.04138e+05   -2.66781e+04   -8.51221e+03   -2.74692e+06
   Coul. recip. Position Rest.      Potential    Kinetic En.   Total Energy
   -9.59421e+05    5.41532e+03   -3.09689e+06    5.18959e+05   -2.57793e+06
    Temperature Pres. DC (bar) Pressure (bar)   Constr. rmsd
    2.97550e+02   -1.14933e+02    5.41944e+01    0.0e+00

 Writing checkpoint, step 49430 at Fri Jan 27 09:43:23 2012

 ---
 Restarting from checkpoint, appending to previous log file.
 ...
 ...
 Started mdrun on node 0 Tue Mar  6 16:40:10 2012

           Step           Time         Lambda
          49430       98.86000        0.0

 Grid: 18 x 17 x 25 cells
   Energies (kJ/mol)
       G96Angle    Proper Dih. Ryckaert-Bell.  Improper Dih.          LJ-14
    5.84241e+04    3.69121e+04    2.09533e+03    8.80916e+03   -4.67086e+02
     Coulomb-14        LJ (SR)        LJ (LR)  Disper. corr.   Coulomb (SR)
    2.29528e+05    2.99825e+05   -2.67028e+04   -8.51334e+03   -2.74410e+06
   Coul. recip. Position Rest.      Potential    Kinetic En.   Total Energy
   -9.59506e+05    5.47116e+03   -3.09823e+06    5.18993e+05   -2.57923e+06
    Temperature Pres. DC (bar) Pressure (bar)   Constr. rmsd
    2.97570e+02   -1.14963e+02    2.67842e+00    0.0e+00

           Step           Time         Lambda
          5      100.0        0.0

   Energies (kJ/mol)
       G96Angle    Proper Dih. Ryckaert-Bell.  Improper Dih.          LJ-14
    5.86161e+04    3.71585e+04    2.15336e+03    8.92946e+03   -4.84684e+02
     Coulomb-14        LJ (SR)        LJ (LR)  Disper. corr.   Coulomb (SR)
    2.29950e+05    3.01014e+05   -2.66724e+04   -8.51306e+03   -2.74349e+06
   Coul. recip. Position Rest.      Potential    Kinetic En.   Total Energy
   -9.59537e+05    5.56712e+03   -3.09531e+06    5.19371e+05   -2.57594e+06
    Temperature Pres. DC (bar) Pressure (bar)   Constr. rmsd
    2.97787e+02   -1.14956e+02    2.36460e+01    1.50068e-05
 [End of log-file]
 ***

 I wonder whether this is my mistake (using mdrun incorrectly), a GROMACS
 problem, or perhaps a problem with the computer cluster (MPI, etc.). I would
 be grateful for any help.

 Many thanks
 Andreas--
 gmx-users mailing list    gmx-users@gromacs.org
 http://lists.gromacs.org/mailman/listinfo/gmx-users
 Please search the archive at 
 http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
 Please don't post (un)subscribe requests to the list. Use the
 www interface or send it to gmx-users-requ...@gromacs.org.
 Can't post? Read http://www.gromacs.org/Support/Mailing_Lists


Re: [gmx-users] continue replica exchange MD

2012-03-22 Thread Erik Marklund

On 22 Mar 2012, at 08:05, lina wrote:

 On Wed, Mar 21, 2012 at 11:36 PM, Kukol, Andreas a.ku...@herts.ac.uk wrote:
 Hello,
 
 Upon continuing a replica exchange MD simulation using the command
 
 mdrun -cpi state.cpt -append -s tpr_remd20ns_.tpr -multi 48 -replex 1 
 -cpt 60 -x xtcRemd_20ns.xtc -c afterRemd_20ns.gro -g logRemd_20ns.log -v -e 
 edrRemd_20ns.edr -stepout 2000
 
 From my side, I have no problem resuming or extending the REMD
 simulations in V.4.5.5 and 4.5.4

I've had problems continuing REMD simulations with GROMACS 4.5.5
(although they manifested differently, IIRC). The release-4-5-patches branch
contains bugfixes that solved those problems, so I suggest trying it.

Erik

 

[gmx-users] continue replica exchange MD

2012-03-21 Thread Kukol, Andreas
Hello,

Upon continuing a replica exchange MD simulation using the command 

mdrun -cpi state.cpt -append -s tpr_remd20ns_.tpr -multi 48 -replex 1 -cpt 
60 -x xtcRemd_20ns.xtc -c afterRemd_20ns.gro -g logRemd_20ns.log -v -e 
edrRemd_20ns.edr -stepout 2000

I get the following output:
**
...
...
500 steps,  1.0 ps (continuing from step 49430, 98.9 ps).
500 steps,  1.0 ps (continuing from step 49430, 98.9 ps).

step 49430, will finish Wed Sep 12 16:09:33 2012
step 5, will finish Thu May 24 11:23:04 2012
Step 47546: resetting all time and cycle counters

= PBS: job killed: walltime 604823 exceeded limit 604800
Terminated
**

Apparently, the job runs for one week on a computer cluster (that is the 
maximum time allowed), but it does not progress very much beyond step 49430.

Also the log-file does not show any more steps:

           Step           Time         Lambda
          46455       92.91000        0.0

Grid: 18 x 17 x 25 cells
   Energies (kJ/mol)
       G96Angle    Proper Dih. Ryckaert-Bell.  Improper Dih.          LJ-14
    5.83095e+04    3.70277e+04    2.14102e+03    8.83853e+03   -7.33070e+02
     Coulomb-14        LJ (SR)        LJ (LR)  Disper. corr.   Coulomb (SR)
    2.29503e+05    3.04138e+05   -2.66781e+04   -8.51221e+03   -2.74692e+06
   Coul. recip. Position Rest.      Potential    Kinetic En.   Total Energy
   -9.59421e+05    5.41532e+03   -3.09689e+06    5.18959e+05   -2.57793e+06
    Temperature Pres. DC (bar) Pressure (bar)   Constr. rmsd
    2.97550e+02   -1.14933e+02    5.41944e+01    0.0e+00

Writing checkpoint, step 49430 at Fri Jan 27 09:43:23 2012

---
Restarting from checkpoint, appending to previous log file.
...
...
Started mdrun on node 0 Tue Mar  6 16:40:10 2012

           Step           Time         Lambda
          49430       98.86000        0.0

Grid: 18 x 17 x 25 cells
   Energies (kJ/mol)
       G96Angle    Proper Dih. Ryckaert-Bell.  Improper Dih.          LJ-14
    5.84241e+04    3.69121e+04    2.09533e+03    8.80916e+03   -4.67086e+02
     Coulomb-14        LJ (SR)        LJ (LR)  Disper. corr.   Coulomb (SR)
    2.29528e+05    2.99825e+05   -2.67028e+04   -8.51334e+03   -2.74410e+06
   Coul. recip. Position Rest.      Potential    Kinetic En.   Total Energy
   -9.59506e+05    5.47116e+03   -3.09823e+06    5.18993e+05   -2.57923e+06
    Temperature Pres. DC (bar) Pressure (bar)   Constr. rmsd
    2.97570e+02   -1.14963e+02    2.67842e+00    0.0e+00

           Step           Time         Lambda
          5      100.0        0.0

   Energies (kJ/mol)
       G96Angle    Proper Dih. Ryckaert-Bell.  Improper Dih.          LJ-14
    5.86161e+04    3.71585e+04    2.15336e+03    8.92946e+03   -4.84684e+02
     Coulomb-14        LJ (SR)        LJ (LR)  Disper. corr.   Coulomb (SR)
    2.29950e+05    3.01014e+05   -2.66724e+04   -8.51306e+03   -2.74349e+06
   Coul. recip. Position Rest.      Potential    Kinetic En.   Total Energy
   -9.59537e+05    5.56712e+03   -3.09531e+06    5.19371e+05   -2.57594e+06
    Temperature Pres. DC (bar) Pressure (bar)   Constr. rmsd
    2.97787e+02   -1.14956e+02    2.36460e+01    1.50068e-05
[End of log-file]
***

I wonder whether this is my mistake (using mdrun incorrectly), a GROMACS
problem, or perhaps a problem with the computer cluster (MPI, etc.). I would be
grateful for any help.

Many thanks
Andreas


Re: [gmx-users] continue replica exchange MD

2012-03-21 Thread Mark Abraham

On 22/03/2012 2:36 AM, Kukol, Andreas wrote:

Hello,

Upon continuing a replica exchange MD simulation using the command

mdrun -cpi state.cpt -append -s tpr_remd20ns_.tpr -multi 48 -replex 1 -cpt 
60 -x xtcRemd_20ns.xtc -c afterRemd_20ns.gro -g logRemd_20ns.log -v -e 
edrRemd_20ns.edr -stepout 2000


It's almost certainly hanging at the next replica-exchange attempt.
Which GROMACS version are you using? Is it the same for the original run
and the restart?


Do the times in the .cpt files match? You can check with gmxdump. If they
do not, you need a matching set of .cpt files, which you can assemble by
using some of the state_prev.cpt files. Back everything up first, then copy
and rename .cpt files to make it work.


Mark
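
The checkpoint comparison suggested above can be scripted. The Python sketch below is only an illustration under stated assumptions: that gmxdump -cp writes the checkpoint header to standard output, that the header contains a line of the form "step = N", and that the per-replica checkpoints are named state0.cpt, state1.cpt, and so on. The helper names (parse_step, dump_checkpoint, find_outliers) are hypothetical. Adjust the pattern and file names if your installation differs.

```python
import re
import subprocess
from collections import Counter

# Assumed format of the step line in the gmxdump output.
STEP_RE = re.compile(r"^\s*step\s*=\s*(\d+)", re.MULTILINE)


def parse_step(dump_text):
    """Return the first 'step = N' value in the dump text, or None."""
    match = STEP_RE.search(dump_text)
    return int(match.group(1)) if match else None


def dump_checkpoint(path, gmxdump="gmxdump"):
    """Run 'gmxdump -cp <path>' and return what it writes to stdout."""
    result = subprocess.run([gmxdump, "-cp", path],
                            capture_output=True, text=True, check=True)
    return result.stdout


def find_outliers(steps):
    """Given {filename: step}, return (most common step, files that differ)."""
    majority, _count = Counter(steps.values()).most_common(1)[0]
    return majority, {f: s for f, s in steps.items() if s != majority}


if __name__ == "__main__":
    n_replicas = 48  # matches -multi 48 in the original command
    steps = {}
    for i in range(n_replicas):
        name = f"state{i}.cpt"  # assumed per-replica checkpoint name
        steps[name] = parse_step(dump_checkpoint(name))
    majority, outliers = find_outliers(steps)
    print(f"most common checkpoint step: {majority}")
    for name, step in sorted(outliers.items()):
        print(f"{name}: step {step} differs")
```

Any file the script reports is a candidate for replacement by a checkpoint from the same step as the others, for example the corresponding _prev file, after everything has been backed up.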