[gmx-users] Question regarding tpr files and rmsd

2010-08-12 Thread Bernhard Knapp

Dear users

Due to a hard disk crash we lost several md simulations. Fortunalty we  
have backup copies of the trajectry files (xtc format) and structure 
files of the first frame of the simulation (created via trjconv -b 0 -e  
0 -f myName.md.trr -o myName.md.firstframe.pdb  -s myName.md.tpr). We 
do  not have the tpr files which we usually used for example for the 
rmsd  calculations. We found that the -s option of g_rms does not only 
take  tpr files but also pdb files, however when I compare the resulting 
xvg  files the values are slightly different and I get a warning 
Warning: if  there are broken molecules in the trajectory file, they 
can not be made  whole without a run input file. The average difference 
(over 10 ns)  between the xvg file based on the tpr and on the pdb is  
0.01447971.

Example for the 2 file:

[bkn...@quovadis02 test]$ sdiff rmsd.pdb.xvg rmsd.tpr.xvg | less
# This file was created Thu Aug 12 11:23:09 2010  | # This 
file was created Thu Aug 12 11:22:06 2010
# by the following command: # by the 
following command:
# g_rms -f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.firstframe.pdb -o r | # g_rms 
-f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.tpr -o rmsd.tpr.xvg

#   #
# g_rms is part of G R O M A C S:   # g_rms 
is part of G R O M A C S:

#   #
# GROtesk MACabre and Sinister| # S  C  
A  M  O  R  G

#   #
@title RMSD   @
title RMSD
@xaxis  label Time (ps)   @
xaxis  label Time (ps)
@yaxis  label RMSD (nm)   @
yaxis  label RMSD (nm)

@TYPE xy@TYPE xy
@ subtitle Protein after lsq fit to Protein   @ 
subtitle Protein after lsq fit to Protein
  0.0000.0005041 |
0.0000.0005046
  3.0000.1072081 |
3.0000.0981387
  6.0000.1281023 |
6.0000.1207779
  9.0000.1452615 |
9.0000.1351306

...


My questions are now:

- Why are the xvg files different if they are based on the tpr and on 
the pdb file?


- What is the more appropriate way to calculate the rmsd?

- Just if the more appropriate way is the tpr file: Is it valid to 
recreate the tpr file via grompp solely on the firstframe.pdb and the 
xtc of the trajectory? eg via
pdb2gmx -f 1mi5_A6D.md.firstframe.pdb -o 1mi5_A6D.md.firstframe.pdb.gro 
-p 1mi5_A6D.md.top
editconf -f 1mi5_A6D.md.firstframe.pdb.gro -o 
1mi5_A6D.firstframe.cube.pdb -bt cubic -d 2.0
genbox -cp 1mi5_A6D.firstframe.cube.pdb -cs spc216.gro -o 
1mi5_A6D.firstframe.water.pdb -p 1mi5_A6D.md.top
grompp -f md.mdp -c 1mi5_A6D.firstframe.water.pdb -p 1mi5_A6D.md.top -o 
1mi5_A6D.md.RECREATED.tpr
then the average difference between the xvg file based on the tpr and 
the recreated tpr is 1.4865E-05 (which is much more similar however 
still not identical)


example:
[bkn...@quovadis02 test]$ sdiff rmsd.tprRECREATED.xvg rmsd.tpr.xvg | less
# This file was created Thu Aug 12 11:59:23 2010  | # This 
file was created Thu Aug 12 11:22:06 2010
# by the following command: # by the 
following command:
# g_rms -f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.RECREATED.tpr -o rm | # g_rms 
-f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.tpr -o rmsd.tpr.xvg

#   #
# g_rms is part of G R O M A C S:   # g_rms 
is part of G R O M A C S:

#   #
# GROup of MAchos and Cynical Suckers | # S  C  
A  M  O  R  G

#   #
@title RMSD   @
title RMSD
@xaxis  label Time (ps)   @
xaxis  label Time (ps)
@yaxis  label RMSD (nm)   @
yaxis  label RMSD (nm)

@TYPE xy@TYPE xy
@ subtitle Protein after lsq fit to Protein   @ 
subtitle Protein after lsq fit to Protein
  0.0000.0022253 |
0.0000.0005046
  3.0000.0981661 |
3.0000.0981387
  6.0000.1207914 |
6.0000.1207779
  9.0000.1351781 |
9.0000.1351306

...


cheers
Bernhard



--
gmx-users mailing list

Re: [gmx-users] Question regarding tpr files and rmsd

2010-08-12 Thread Mark Abraham


- Original Message -
From: Bernhard Knapp bernhard.kn...@meduniwien.ac.at
Date: Thursday, August 12, 2010 23:28
Subject: [gmx-users] Question regarding tpr files and rmsd
To: gmx-users@gromacs.org

 Dear users
 
 Due to a hard disk crash we lost several md simulations. 
 Fortunalty we  have backup copies of the trajectry files 
 (xtc format) and structure files of the first frame of the 
 simulation (created via trjconv -b 0 -e  0 -f myName.md.trr 
 -o myName.md.firstframe.pdb  -s myName.md.tpr). We do  
 not have the tpr files which we usually used for example for the 
 rmsd  calculations. We found that the -s option of g_rms 
 does not only take  tpr files but also pdb files, however 
 when I compare the resulting xvg  files the values are 
 slightly different and I get a warning Warning: if  there 
 are broken molecules in the trajectory file, they can not be 
 made  whole without a run input file.

Sure. GROMACS-written PDB files contain no connectivity data. .tpr files do.

 The average 
 difference (over 10 ns)  between the xvg file based on the 
 tpr and on the pdb is  0.01447971.

Probably quite a reasonable difference in the average RMSD (in nm), given the 
pdb approximates the floating-point numbers in the .tpr with only 3 decimal 
points (in Angstrom)

 Example for the 2 file:
 
 [bkn...@quovadis02 test]$ sdiff rmsd.pdb.xvg rmsd.tpr.xvg | less
 # This file was created Thu Aug 12 11:23:09 
 2010  | # This file was created Thu Aug 12 11:22:06 2010
 # by the following 
 command: # by the following command:
 # g_rms -f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.firstframe.pdb -o r | 
 # g_rms -f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.tpr -o rmsd.tpr.xvg
 #   #
 # g_rms is part of G R O M A C 
 S:   # g_rms is part of G R O M A C S:
 #   #
 # GROtesk MACabre and 
 Sinister| # S  C  A  M  O  R  G
 #   #
 @title 
 RMSD   @title RMSD
 @xaxis  label Time 
 (ps)   @xaxis  label Time (ps)
 @yaxis  label RMSD 
 (nm)   @yaxis  label RMSD (nm)
 @TYPE 
 xy@TYPE xy
 @ subtitle Protein after lsq fit to 
 Protein   @ subtitle Protein after lsq fit to Protein
   0.000
 0.0005041 |0.0000.0005046
   3.000
 0.1072081 |3.0000.0981387
   6.000
 0.1281023 |6.0000.1207779
   9.000
 0.1452615 |9.0000.1351306
 ...
 
 
 My questions are now:
 
 - Why are the xvg files different if they are based on the tpr 
 and on the pdb file?

PDB has more limited precision than the .tpr, and if you're fitting to their 
structure, you'll get slightly different results.

 - What is the more appropriate way to calculate the rmsd?

Depends what you're trying to measure - but I'd argue there shouldn't be a 
significant difference between these two.
 
 - Just if the more appropriate way is the tpr file: Is it valid 
 to recreate the tpr file via grompp solely on the firstframe.pdb 
 and the xtc of the trajectory? eg via

It's about the best you can do. IIRC, the firstframe.pdb will have higher 
precision than the first frame of the .xtc, typically, and so produce a .tpr 
closer to the original.

The procedure below is unnecessarily regenerating water. Use pdb2gmx to 
generate the .top, and then simply use grompp to combine .mdp, .top and 
whichever source of the first frame you choose.

 pdb2gmx -f 1mi5_A6D.md.firstframe.pdb -o 
 1mi5_A6D.md.firstframe.pdb.gro -p 1mi5_A6D.md.top
 editconf -f 1mi5_A6D.md.firstframe.pdb.gro -o 
 1mi5_A6D.firstframe.cube.pdb -bt cubic -d 2.0
 genbox -cp 1mi5_A6D.firstframe.cube.pdb -cs spc216.gro -o 
 1mi5_A6D.firstframe.water.pdb -p 1mi5_A6D.md.top
 grompp -f md.mdp -c 1mi5_A6D.firstframe.water.pdb -p 
 1mi5_A6D.md.top -o 1mi5_A6D.md.RECREATED.tpr
 then the average difference between the xvg file based on the 
 tpr and the recreated tpr is 1.4865E-05 (which is much more 
 similar however still not identical)

You fitted and analyzed with respect to Protein, so the re-generation of the 
water should have no effect here. The remaining difference should be 
attributable to the difference between whatever coordinate changes pdb2gmx is 
making (I can't tell what, if any) based on the two slightly different starting 
points - but subjecting both to rounding to .gro precision. This last point 
explains why they're so similar, I expect.

Mark
  
 example:
 [bkn...@quovadis02 test]$ sdiff rmsd.tprRECREATED.xvg 
 rmsd.tpr.xvg