Re: [gmx-users] Energy/temperature drifts in Gromacs 4.0 / inconsistencies with Gromacs 3.3.1

2009-03-14 Thread David van der Spoel

Pietro Amodeo wrote:

> ADDENDUM: I've just compiled Gromacs 3.3.3 on cluster NEW, with the Intel
> compiler (see below). A simulation (still running) on system 2 with the
> parallel double-precision version shows preliminary results in line with
> Gromacs 3.3.1 on cluster OLD, with no trace of abnormal oscillations or
> drifts in temperature or energy.


Please open a bugzilla and upload the different tpr files. If they
indeed crash after a short time, it should be possible to see what is
going on.



> 2) Cluster: NEW(Infiniband)
>
>    (CentOS 5)
>    kernel 2.6.18-53.el5
>    icc 10.1 (Build 20070913 Pack.ID: l_cc_p_10.1.008)
>    fftw 3.2.1
>    ofed131 - openmpi 1.2.6
>
> Best,
> Pietro



--
David van der Spoel, Ph.D., Professor of Biology
Molec. Biophys. group, Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone:  +46184714205. Fax: +4618511755.
sp...@xray.bmc.uu.se    sp...@gromacs.org    http://folding.bmc.uu.se
___
gmx-users mailing list    gmx-users@gromacs.org
http://www.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/search before posting!
Please don't post (un)subscribe requests to the list. Use the 
www interface or send it to gmx-users-requ...@gromacs.org.

Can't post? Read http://www.gromacs.org/mailing_lists/users.php


Re: [gmx-users] Energy/temperature drifts in Gromacs 4.0 / inconsistencies with Gromacs 3.3.1

2009-03-13 Thread Pietro Amodeo
ADDENDUM: I've just compiled Gromacs 3.3.3 on cluster NEW, with the Intel
compiler (see below). A simulation (still running) on system 2 with the
parallel double-precision version shows preliminary results in line with
Gromacs 3.3.1 on cluster OLD, with no trace of abnormal oscillations or
drifts in temperature or energy.

> 2) Cluster: NEW(Infiniband)
>    (CentOS 5)
>    kernel 2.6.18-53.el5
>    icc 10.1 (Build 20070913 Pack.ID: l_cc_p_10.1.008)
>    fftw 3.2.1
>    ofed131 - openmpi 1.2.6

Best,
Pietro
-- 
Dr. Pietro Amodeo, Ph.D.
Istituto di Chimica Biomolecolare del CNR
Comprensorio "A. Olivetti", Edificio 70
Via Campi Flegrei 34
I-80078 Pozzuoli (Napoli) - Italy
Phone  +39-0818675072
Fax    +39-0818041770
Email  pamo...@icmib.na.cnr.it



Re: [gmx-users] Energy/temperature drifts in Gromacs 4.0 / inconsistencies with Gromacs 3.3.1

2009-03-13 Thread Pietro Amodeo
> That was a long mail. How about T-coupling? Which algorithm did you use?

Sorry for the long mail. For T-coupling I used the Berendsen algorithm.
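
For reference, a minimal sketch of what Berendsen weak coupling does when
taken in isolation: at each step the velocities are scaled by
lambda = sqrt(1 + (dt/tau_t) * (T0/T - 1)), so the temperature relaxes
exponentially towards the target T0. The time step and coupling time
below are hypothetical values chosen for illustration (the thread does
not state them), the function names are made up here, and the model
ignores the forces entirely. The two temperatures are the final and
target values reported for system 2.

```python
import math


def berendsen_lambda(temp, target, dt, tau_t):
    """Velocity scaling factor of the Berendsen weak-coupling thermostat."""
    return math.sqrt(1.0 + (dt / tau_t) * (target / temp - 1.0))


def relax(temp, target, dt, tau_t, nsteps):
    """Apply only the thermostat for nsteps steps (no forces, idealized)."""
    for _ in range(nsteps):
        lam = berendsen_lambda(temp, target, dt, tau_t)
        temp *= lam * lam  # kinetic energy, hence T, scales with lambda^2
    return temp


# Hypothetical settings, NOT taken from the thread: dt = 2 fs, tau_t = 0.1 ps.
DT, TAU_T = 0.002, 0.1

# Start from the 340.853 K reported at the end of the 4.0.4 run, target 300 K.
after_1ps = relax(340.853, 300.0, DT, TAU_T, 500)
print(f"temperature after 1 ps of pure coupling: {after_1ps:.3f} K")
```

In this idealized picture the deviation from the target shrinks by a
factor (1 - dt/tau_t) per step, so a temperature that stays tens of
kelvin above the target would suggest that energy is entering the system
faster than the thermostat removes it. The sketch does not say where
that energy comes from.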

> Did you do a diff on the md.log to check for differences in the mdp
> parameters?

Yes, I attach the diff file (stripped of references and other comments),
as well as the run parameters reported in the NEW md log file.
For system 1 I tried both the same tpr file (from 3.3.1) in the 3.3.1 and
4.0.4 simulations, and a new 4.0.4-format tpr file generated with grompp
for the 4.0.4 md run, but all simulations stopped within about 2,000 steps.
System 2, for which I attached the differing parts of the log files, was
simulated using the same tpr file (from 3.3.1) for both the 3.3.1 and
4.0.4 MD runs.
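
The same comparison can also be made per parameter rather than per line.
A minimal sketch in Python, assuming the parameter section of md.log
consists of `name = value` lines as in the attached diff; the function
names and the toy inputs are illustrative only and are not the actual
log files:

```python
import re

# Matches lines such as "   rtpi = 0.05" or "posres_com[0]= 0.0e+00".
PARAM_RE = re.compile(r"^\s*([A-Za-z_][\w\-\[\]]*)\s*=\s*(.*?)\s*$")


def parse_params(log_text):
    """Collect name -> value pairs from the 'name = value' lines of a log."""
    params = {}
    for line in log_text.splitlines():
        match = PARAM_RE.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    return params


def diff_params(old, new):
    """Return (only in old, only in new, changed) parameter dictionaries."""
    only_old = {k: old[k] for k in old.keys() - new.keys()}
    only_new = {k: new[k] for k in new.keys() - old.keys()}
    changed = {k: (old[k], new[k])
               for k in old.keys() & new.keys() if old[k] != new[k]}
    return only_old, only_new, changed


# Toy inputs for illustration; real use would read the two md.log files.
old_log = "   tcoupl = Berendsen\n   shake_tol = 0.0001\n"
new_log = "   tcoupl = Berendsen\n   rtpi = 0.05\n   implicit_solvent = No\n"

only_old, only_new, changed = diff_params(parse_params(old_log),
                                          parse_params(new_log))
print("only in old:", only_old)
print("only in new:", only_new)
print("changed:", changed)
```

Parameters that exist in only one version show up separately from those
whose value actually changed, which makes it easier to spot a setting
that silently took a different default.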

> Did you run these in parallel? What happens when you run it
> sequentially? And what happens in single precision?

For system 1, changing the precision, the compiler, or serial vs. parallel
execution only affects the exact step at which the simulation stops. For
system 2, likewise, these parameters do not affect the overall oscillating
behaviour of the simulation.

Best,
Pietro

-- 
Dr. Pietro Amodeo, PhD.
Istituto di Chimica Biomolecolare del CNR
Comprensorio "A. Olivetti", Edificio 70
Via Campi Flegrei 34
I-80078 Pozzuoli (Napoli) - Italy
Phone  +39-0818675072
Fax    +39-0818041770
Email  pamo...@icmib.na.cnr.it

11c11
< :-)  VERSION 3.3.1  (-:
---
> :-)  VERSION 4.0.4  (-:
40,65d55
< CPU=  0, lastcg= 1461, targetcg= 6589, myshift=5
< CPU=  1, lastcg= 2717, targetcg= 7845, myshift=5
< CPU=  2, lastcg= 3973, targetcg= 9101, myshift=5
< CPU=  3, lastcg= 5229, targetcg=  102, myshift=5
< CPU=  4, lastcg= 6485, targetcg= 1358, myshift=4
< CPU=  5, lastcg= 7740, targetcg= 2612, myshift=4
< CPU=  6, lastcg= 8998, targetcg= 3870, myshift=4
< CPU=  7, lastcg=10255, targetcg= 5128, myshift=4
< nsb->shift =   5, nsb->bshift=  0
< Listing Scalars
< nsb->nodeid: 0
< nsb->nnodes:  8
< nsb->cgtotal: 10256
< nsb->natoms:  30126
< nsb->shift:   5
< nsb->bshift:  0
< Nodeid   index  homenr  cgload  workload
<      0       0    3766    1462      1462
<      1    3766    3766    2718      2718
<      2    7532    3766    3974      3974
<      3   11298    3766    5230      5230
<      4   15064    3766    6486      6486
<      5   18830    3765    7741      7741
<      6   22595    3766    8999      8999
<      7   26361    3765   10256     10256
< 
73,74d62
bPeriodicMols= FALSE
>bContinuation= FALSE
109a98,106
>refcoord_scaling = No
>posres_com (3):
>   posres_com[0]= 0.0e+00
>   posres_com[1]= 0.0e+00
>   posres_com[2]= 0.0e+00
>posres_comB (3):
>   posres_comB[0]= 0.0e+00
>   posres_comB[1]= 0.0e+00
>   posres_comB[2]= 0.0e+00
111a109
>rtpi = 0.05
120a119
>implicit_solvent = No
121a121
>gb_epsilon_solvent   = 80
125c125,128
gb_obc_alpha = 1
>gb_obc_beta  = 0.8
>gb_obc_gamma = 4.85
>sa_surface_tension   = 2.092
127d129
nwall= 0
>wall_type= 9-3
>wall_atomtype[0] = -1
>wall_atomtype[1] = -1
>wall_density[0]  = 0
>wall_density[1]  = 0
>wall_ewald_zfac  = 3
>pull = no
>disre= No
143,144d153
shake_tol= 0.0001
199,216d207
< Max number of graph edges per atom is 4
< Table routines are used for coulomb: TRUE
< Table routines are used for vdw: FALSE
< Using a Gaussian width (1/beta) of 0.320163 nm for Ewald
< Cut-off's:   NS: 1   Coulomb: 1   LJ: 1.4
< System total charge: 0.000
< Generated table with 1200 data points for Ewald.
< Tabscale = 500 points/nm
< Generated table with 1200 data points for LJ6.
< Tabscale = 500 points/nm
< Generated table with 1200 data points for LJ12.
< Tabscale = 500 points/nm
< Generated table with 500 data points for 1-4 COUL.
< Tabscale = 500 points/nm
< Generated table with 500 data points for 1-4 LJ6.
< Tabscale = 500 points/nm
< Generated table with 500 data points for 1-4 LJ12.
< Tabscale = 500 points/nm
218c209,224
< Enabling SPC water optimization for 9451 molecules.
---
> Initializing Domain Decomposition on 8 nodes
> Dynamic load balancing: auto
> Will sort the charge groups at every domain (re)decomposition
> Initial maximum inter charge-group distances:
> two-body bonded interactions: 0.614 nm, LJ-14, atoms 1485 1492
>   multi-body bonded interactions: 0.614 nm, Proper Dih., atoms 1485 1492
> Minimum cell size due to bonded interactions: 0.675 nm
> Maximum dist

Re: [gmx-users] Energy/temperature drifts in Gromacs 4.0 / inconsistencies with Gromacs 3.3.1

2009-03-13 Thread David van der Spoel


[gmx-users] Energy/temperature drifts in Gromacs 4.0 / inconsistencies with Gromacs 3.3.1

2009-03-13 Thread Pietro Amodeo
Hi Gromacs users/developers,

we have two Gromacs installations on two different clusters, with the
following software versions:

1) Cluster: OLD(Myrinet)
   Gromacs 3.3.1
   (CentOS 4 / Rocks 4.1)
   kernel 2.6.9-22.ELsmp
   gcc 3.4.4
   fftw 3.1.2
   mpich-gm 1.2.7p1..18

2) Cluster: NEW(Infiniband)
   Gromacs 4.0.4 / 4.0.3
   (CentOS 5)
   kernel 2.6.18-53.el5
   gcc 4.1.2 20070626 (Red Hat 4.1.2-14) / icc 10.1 (Build 20070913 Pack.ID: l_cc_p_10.1.008)
   fftw 3.2.1
   ofed131 - openmpi 1.2.6

Serial and parallel versions of Gromacs 4.0.3 and 4.0.4, in both single
and double precision, were compiled with gcc (the deprecated 4.1.2,
although the tests either passed or failed with only minor discrepancies)
and with the Intel 10.1 compilers.

We tried to reproduce on cluster NEW simple MD equilibrations of two
different systems (proteins solvated in SPC water plus counterions) that
had run successfully on cluster OLD. As starting tpr files we used either
the ones produced and used with 3.3.1, or new 4.0.4 files.
Although the starting energies for both systems were substantially equal:
--
Cluster NEW system 2:
           Step           Time         Lambda
              0            0.0            0.0

   Energies (kJ/mol)
       G96Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    7.90343e+02    7.80369e+02    2.11086e+02    4.58020e+02    1.92904e+04
        LJ (SR)        LJ (LR)   Coulomb (SR)   Coul. recip. Position Rest.
    1.09286e+05   -1.43221e+03   -4.54033e+05   -5.15749e+04    2.74030e-01
      Potential    Kinetic En.   Total Energy    Temperature Pressure (bar)
   -3.76224e+05    7.83268e+04   -2.97897e+05    3.12792e+02    9.77797e+03
  Cons. rmsd ()
    2.19464e-05
--
Cluster OLD system 2:
   Rel. Constraint Deviation:  Max    between atoms     RMS
   Before LINCS 0.098014   1670   1671   0.006831
After LINCS 0.000104509511   0.22

   Energies (kJ/mol)
       G96Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    7.90348e+02    7.80369e+02    2.11085e+02    4.58017e+02    1.92904e+04
        LJ (SR)        LJ (LR)   Coulomb (SR)   Coul. recip. Position Rest.
    1.09286e+05   -1.43221e+03   -4.54033e+05   -5.15750e+04    2.74027e-01
      Potential    Kinetic En.   Total Energy    Temperature Pressure (bar)
   -3.76224e+05    7.83267e+04   -2.97897e+05    3.12791e+02    9.78296e+03
--
in both cases the simulations with Gromacs 3.3.1 ran without any problem
(and provided good starting points for very stable production runs), while
those performed with Gromacs 4.0.3 or 4.0.4 systematically started to show
wide oscillations in total energy and temperature after 2 ps or less, with
a net upward drift in energy on both systems. In system 1 the temperature
variations grew very rapidly, and every run terminated prematurely with
errors in LINCS or in the routines that calculate 1-4 interactions.
System 2 exhibited a smaller energy drift and rather steady, but still
significant, temperature oscillations, so the 100 ps run (8 cores,
double-precision parallel version compiled with the Intel compiler,
starting from the original 3.3.1 tpr file) ended (apparently) normally.
However, the average energy was higher than in the corresponding 3.3.1
simulation, and the average temperature failed to reach the targeted
300 K. In particular, the protein suffered from poor thermal relaxation
under the same conditions that worked flawlessly in the 3.3.1 simulations.
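
A quick consistency check on the numbers reported in this message,
sketched in Python. For a fixed number of degrees of freedom N_df, the
temperature follows from the kinetic energy as T = 2 E_kin / (N_df k_B).
The kinetic energies and temperatures are read from the first-step and
last-step log excerpts for cluster NEW, system 2; the value of k_B and
the function name are assumptions added here, and the constant used by
the program itself may differ in the last digits.

```python
# Boltzmann constant in kJ/(mol K); an assumed value (gas constant / 1000).
K_B = 0.0083144626


def degrees_of_freedom(e_kin, temperature):
    """Invert T = 2 * E_kin / (N_df * k_B) to estimate N_df."""
    return 2.0 * e_kin / (K_B * temperature)


# (Kinetic En. in kJ/mol, Temperature in K), cluster NEW, system 2.
frames = {
    "first step": (7.83268e+04, 3.12792e+02),
    "last step": (8.53539e+04, 3.40853e+02),
}

ndf = {name: degrees_of_freedom(ek, t) for name, (ek, t) in frames.items()}
for name, value in ndf.items():
    print(f"{name}: N_df ~ {value:.0f}")

# Excess of the final temperature over the 300 K target mentioned above.
excess = frames["last step"][1] - 300.0
print(f"final temperature is {excess:.1f} K above the target")
```

If the two N_df estimates agree, the reported temperature rise reflects
a real increase in kinetic energy rather than a change in how the
temperature is computed from it.
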
The final, average and r.m.s. values from log files of the two
corresponding runs on system 2 with 3.3.1 and 4.0.4 are:


Cluster NEW system 2:
           Step           Time         Lambda
              5          100.0            0.0

Writing checkpoint, step 5 at Thu Mar 12 16:21:03 2009

   Energies (kJ/mol)
       G96Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    5.20200e+03    1.40303e+03    1.56905e+03    5.80966e+02    1.89786e+04
        LJ (SR)        LJ (LR)   Coulomb (SR)   Coul. recip. Position Rest.
    7.93006e+04   -1.43727e+03   -4.96019e+05   -6.18250e+04    2.12927e+03
      Potential    Kinetic En.   Total Energy    Temperature Pressure (bar)
   -4.50118e+05    8.53539e+04   -3.64764e+05    3.40853e+02    7.37194e+03
  Cons. rmsd ()
    6.33529e-05

<==  ###  ==>
<  A V E R A G E S  >
<==  ###  ==>

   Energies (kJ/mol)
       G96Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    5.22068e+03    1.38036e+03    1.53871e+03    1.11820e+03    1.90723e+04
        LJ (SR)        LJ (LR)   Coulomb (SR)   Coul. recip. Posi