Re: [gmx-users] Question regarding tpr files and rmsd

2010-08-12 Thread Mark Abraham


- Original Message -
From: Bernhard Knapp 
Date: Thursday, August 12, 2010 23:28
Subject: [gmx-users] Question regarding tpr files and rmsd
To: gmx-users@gromacs.org

> Dear users
> 
> Due to a hard disk crash we lost several md simulations. 
> Fortunalty we  have backup copies of the trajectry files 
> (xtc format) and structure files of the first frame of the 
> simulation (created via trjconv -b 0 -e  0 -f myName.md.trr 
> -o myName.md.firstframe.pdb  -s myName.md.tpr). We do  
> not have the tpr files which we usually used for example for the 
> rmsd  calculations. We found that the -s option of g_rms 
> does not only take  tpr files but also pdb files, however 
> when I compare the resulting xvg  files the values are 
> slightly different and I get a warning "Warning: if  there 
> are broken molecules in the trajectory file, they can not be 
> made  whole without a run input file".

Sure. GROMACS-written PDB files contain no connectivity data. .tpr files do.

> The average 
> difference (over 10 ns)  between the xvg file based on the 
> tpr and on the pdb is  0.01447971.

Probably quite a reasonable difference in the average RMSD (in nm), given the 
pdb approximates the floating-point numbers in the .tpr with only 3 decimal 
points (in Angstrom)

> Example for the 2 file:
> 
> [bkn...@quovadis02 test]$ sdiff rmsd.pdb.xvg rmsd.tpr.xvg | less
> # This file was created Thu Aug 12 11:23:09 
> 2010  | # This file was created Thu Aug 12 11:22:06 2010
> # by the following 
> command: # by the following command:
> # g_rms -f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.firstframe.pdb -o r | 
> # g_rms -f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.tpr -o rmsd.tpr.xvg
> #   #
> # g_rms is part of G R O M A C 
> S:   # g_rms is part of G R O M A C S:
> #   #
> # GROtesk MACabre and 
> Sinister| # S  C  A  M  O  R  G
> #   #
> @title 
> "RMSD"   @title "RMSD"
> @xaxis  label "Time 
> (ps)"   @xaxis  label "Time (ps)"
> @yaxis  label "RMSD 
> (nm)"   @yaxis  label "RMSD (nm)"
> @TYPE 
> xy@TYPE xy
> @ subtitle "Protein after lsq fit to 
> Protein"   @ subtitle "Protein after lsq fit to Protein"
>   0.000
> 0.0005041 |0.0000.0005046
>   3.000
> 0.1072081 |3.0000.0981387
>   6.000
> 0.1281023 |6.0000.1207779
>   9.000
> 0.1452615 |9.0000.1351306
> ...
> 
> 
> My questions are now:
> 
> - Why are the xvg files different if they are based on the tpr 
> and on the pdb file?

PDB has more limited precision than the .tpr, and if you're fitting to their 
structure, you'll get slightly different results.

> - What is the more appropriate way to calculate the rmsd?

Depends what you're trying to measure - but I'd argue there shouldn't be a 
significant difference between these two.
 
> - Just if the more appropriate way is the tpr file: Is it valid 
> to recreate the tpr file via grompp solely on the firstframe.pdb 
> and the xtc of the trajectory? eg via

It's about the best you can do. IIRC, the firstframe.pdb will have higher 
precision than the first frame of the .xtc, typically, and so produce a .tpr 
closer to the original.

The procedure below is unnecessarily regenerating water. Use pdb2gmx to 
generate the .top, and then simply use grompp to combine .mdp, .top and 
whichever source of the first frame you choose.

> pdb2gmx -f 1mi5_A6D.md.firstframe.pdb -o 
> 1mi5_A6D.md.firstframe.pdb.gro -p 1mi5_A6D.md.top
> editconf -f 1mi5_A6D.md.firstframe.pdb.gro -o 
> 1mi5_A6D.firstframe.cube.pdb -bt cubic -d 2.0
> genbox -cp 1mi5_A6D.firstframe.cube.pdb -cs spc216.gro -o 
> 1mi5_A6D.firstframe.water.pdb -p 1mi5_A6D.md.top
> grompp -f md.mdp -c 1mi5_A6D.firstframe.water.pdb -p 
> 1mi5_A6D.md.top -o 1mi5_A6D.md.RECREATED.tpr
> then the average difference between the xvg file based on the 
> tpr and the recreated tpr is 1.4865E-05 (which is much more 
> similar however still not identical)

You fitted and analyzed with respect to Protein, so the re-generation of the 
water should have no effect here. The

[gmx-users] Question regarding tpr files and rmsd

2010-08-12 Thread Bernhard Knapp

Dear users

Due to a hard disk crash we lost several md simulations. Fortunalty we  
have backup copies of the trajectry files (xtc format) and structure 
files of the first frame of the simulation (created via trjconv -b 0 -e  
0 -f myName.md.trr -o myName.md.firstframe.pdb  -s myName.md.tpr). We 
do  not have the tpr files which we usually used for example for the 
rmsd  calculations. We found that the -s option of g_rms does not only 
take  tpr files but also pdb files, however when I compare the resulting 
xvg  files the values are slightly different and I get a warning 
"Warning: if  there are broken molecules in the trajectory file, they 
can not be made  whole without a run input file". The average difference 
(over 10 ns)  between the xvg file based on the tpr and on the pdb is  
0.01447971.

Example for the 2 file:

[bkn...@quovadis02 test]$ sdiff rmsd.pdb.xvg rmsd.tpr.xvg | less
# This file was created Thu Aug 12 11:23:09 2010  | # This 
file was created Thu Aug 12 11:22:06 2010
# by the following command: # by the 
following command:
# g_rms -f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.firstframe.pdb -o r | # g_rms 
-f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.tpr -o rmsd.tpr.xvg

#   #
# g_rms is part of G R O M A C S:   # g_rms 
is part of G R O M A C S:

#   #
# GROtesk MACabre and Sinister| # S  C  
A  M  O  R  G

#   #
@title "RMSD"   @
title "RMSD"
@xaxis  label "Time (ps)"   @
xaxis  label "Time (ps)"
@yaxis  label "RMSD (nm)"   @
yaxis  label "RMSD (nm)"

@TYPE xy@TYPE xy
@ subtitle "Protein after lsq fit to Protein"   @ 
subtitle "Protein after lsq fit to Protein"
  0.0000.0005041 |
0.0000.0005046
  3.0000.1072081 |
3.0000.0981387
  6.0000.1281023 |
6.0000.1207779
  9.0000.1452615 |
9.0000.1351306

...


My questions are now:

- Why are the xvg files different if they are based on the tpr and on 
the pdb file?


- What is the more appropriate way to calculate the rmsd?

- Just if the more appropriate way is the tpr file: Is it valid to 
recreate the tpr file via grompp solely on the firstframe.pdb and the 
xtc of the trajectory? eg via
pdb2gmx -f 1mi5_A6D.md.firstframe.pdb -o 1mi5_A6D.md.firstframe.pdb.gro 
-p 1mi5_A6D.md.top
editconf -f 1mi5_A6D.md.firstframe.pdb.gro -o 
1mi5_A6D.firstframe.cube.pdb -bt cubic -d 2.0
genbox -cp 1mi5_A6D.firstframe.cube.pdb -cs spc216.gro -o 
1mi5_A6D.firstframe.water.pdb -p 1mi5_A6D.md.top
grompp -f md.mdp -c 1mi5_A6D.firstframe.water.pdb -p 1mi5_A6D.md.top -o 
1mi5_A6D.md.RECREATED.tpr
then the average difference between the xvg file based on the tpr and 
the recreated tpr is 1.4865E-05 (which is much more similar however 
still not identical)


example:
[bkn...@quovadis02 test]$ sdiff rmsd.tprRECREATED.xvg rmsd.tpr.xvg | less
# This file was created Thu Aug 12 11:59:23 2010  | # This 
file was created Thu Aug 12 11:22:06 2010
# by the following command: # by the 
following command:
# g_rms -f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.RECREATED.tpr -o rm | # g_rms 
-f 1mi5_A6D.md.xtc -s 1mi5_A6D.md.tpr -o rmsd.tpr.xvg

#   #
# g_rms is part of G R O M A C S:   # g_rms 
is part of G R O M A C S:

#   #
# GROup of MAchos and Cynical Suckers | # S  C  
A  M  O  R  G

#   #
@title "RMSD"   @
title "RMSD"
@xaxis  label "Time (ps)"   @
xaxis  label "Time (ps)"
@yaxis  label "RMSD (nm)"   @
yaxis  label "RMSD (nm)"

@TYPE xy@TYPE xy
@ subtitle "Protein after lsq fit to Protein"   @ 
subtitle "Protein after lsq fit to Protein"
  0.0000.0022253 |
0.0000.0005046
  3.0000.0981661 |
3.0000.0981387
  6.0000.1207914 |
6.0000.1207779
  9.0000.1351781 |
9.0000.1351306

...


cheers
Bernhard



--
gmx-us