Re: [gmx-users] Re: Umbrella sampling question
Hi, Blindly defining the center of mass for a group of atoms is not possible in a periodic system such as a typical simulation box. You need some clue as to which periodic copy of every atom that is to be chosen. By providing pull_pbcatom0 you tell gromacs to, for every atom in grp0, use the periodic copy closest to the atom given by pull_pbcatom0. If you have large pullgroups this is necessary to define the inter-group distance in a way that makes sense. If you get different results depending on that setting you really need to figure out which atom is a good center for your calculations. The default behavior is to use the atom whose *index* is in the center of the group. If you for example have a dimeric protein this may correspond to the C-terminus of the first chain or the N-terminus of the second one, which in turn often doesn't coincide with the geometrical center of the group. I suggest you try yet another choice of pull_pbcatom0 that is also close to the center to see if that also give rise to a different distance. As mentioned, the choice of pull_pbcatom0 should not matter as long as the choice allows to figure out how to handle the periodicity. Best, Erik 15 nov 2012 kl. 19.56 skrev Gmx QA: Hi Chris Seems my confusion was that I assumed that the distances in the profile.xvg-file should correspond to something I could measure with g_dist. Turns out it does not. Thank you for helping me sorting out this, I got it now :-) About pull_pbcatom0 though. My box is 2*1.08 nm in all directions: $ tail -n 1 conf0.gro 12.45770 12.45770 17.99590 I am still not sure what pull_pbcatom0 does. You said it should not have any effect on my results, but changing it does result in a different initial distance reported by grompp. In my initial attempts at this, I did not specify anything for pull_pbcatom0, but in the grompp output I get this Pull group natoms pbc atom distance at start reference at t=0 0 21939 10970 1 1 0 2.083 2.083 Estimate for the relative computational load of the PME mesh part: 0.10 This run will generate roughly 761 Mb of data Then, following the advice in the thread I referred to earlier, I set pull_pbcatom0 explicitly in the mdp-file to be an atom close to the COM of the Protein. Then I get from grompp Pull group natoms pbc atom distance at start reference at t=0 0 21939 7058 1 1 0 1.808 1.808 Estimate for the relative computational load of the PME mesh part: 0.10 This run will generate roughly 761 Mb of data As you can see, the initial distance is different (2.083 vs 1.808), and 1.808 is the same as the distance reported by g_dist. Do you have any comments here as to why this is? Thanks /PK What you reported is not what you did. It appears that grompp, gtraj, and g_dist report the same value. Please also report the value that you get from your pullx.xvg file that you get from mdrun, which I suspect will also be the same. The difference that you report is actually between the first FRAME of your trajectory from g_dist and the first LINE of the file from the g_wham output. I see no reason to assume that the values in the output of g_wham must be time-ordered. Also, I have never used g_wham myself (I use an external program to do wham) and so I can not say if you are using it correctly. My overall conclusion is that you need to investigate g_wham output not worry about a new run at this stage. Regarding pull_pbcatom0, there is lots of information on the mailing list about this. It is a global atom number that defines the unit cell for selection of which periodic image of each molecule will be used for the pulling. If all of your box dimensions are 2*1.08 nm, then pull_pbcatom0 will not affect your results. Chris. -- gmx-users mailing listgmx-users@gromacs.org http://lists.gromacs.org/mailman/listinfo/gmx-users * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/Search before posting! * Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-users-requ...@gromacs.org. * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists --- Erik Marklund, PhD Dept. of Cell and Molecular Biology, Uppsala University. Husargatan 3, Box 596,75124 Uppsala, Sweden phone:+46 18 471 6688fax: +46 18 511 755 er...@xray.bmc.uu.se http://www2.icm.uu.se/molbio/elflab/index.html -- gmx-users mailing listgmx-users@gromacs.org http://lists.gromacs.org/mailman/listinfo/gmx-users * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/Search before posting! * Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-users-requ...@gromacs.org. * Can't post? Read
Re: [gmx-users] Re: Umbrella sampling question
Thanks Erik! /PK 2012/11/16 Erik Marklund er...@xray.bmc.uu.se Hi, Blindly defining the center of mass for a group of atoms is not possible in a periodic system such as a typical simulation box. You need some clue as to which periodic copy of every atom that is to be chosen. By providing pull_pbcatom0 you tell gromacs to, for every atom in grp0, use the periodic copy closest to the atom given by pull_pbcatom0. If you have large pullgroups this is necessary to define the inter-group distance in a way that makes sense. If you get different results depending on that setting you really need to figure out which atom is a good center for your calculations. The default behavior is to use the atom whose *index* is in the center of the group. If you for example have a dimeric protein this may correspond to the C-terminus of the first chain or the N-terminus of the second one, which in turn often doesn't coincide with the geometrical center of the group. I suggest you try yet another choice of pull_pbcatom0 that is also close to the center to see if that also give rise to a different distance. As mentioned, the choice of pull_pbcatom0 should not matter as long as the choice allows to figure out how to handle the periodicity. Best, Erik 15 nov 2012 kl. 19.56 skrev Gmx QA: Hi Chris Seems my confusion was that I assumed that the distances in the profile.xvg-file should correspond to something I could measure with g_dist. Turns out it does not. Thank you for helping me sorting out this, I got it now :-) About pull_pbcatom0 though. My box is 2*1.08 nm in all directions: $ tail -n 1 conf0.gro 12.45770 12.45770 17.99590 I am still not sure what pull_pbcatom0 does. You said it should not have any effect on my results, but changing it does result in a different initial distance reported by grompp. In my initial attempts at this, I did not specify anything for pull_pbcatom0, but in the grompp output I get this Pull group natoms pbc atom distance at start reference at t=0 0 21939 10970 1 1 0 2.083 2.083 Estimate for the relative computational load of the PME mesh part: 0.10 This run will generate roughly 761 Mb of data Then, following the advice in the thread I referred to earlier, I set pull_pbcatom0 explicitly in the mdp-file to be an atom close to the COM of the Protein. Then I get from grompp Pull group natoms pbc atom distance at start reference at t=0 0 21939 7058 1 1 0 1.808 1.808 Estimate for the relative computational load of the PME mesh part: 0.10 This run will generate roughly 761 Mb of data As you can see, the initial distance is different (2.083 vs 1.808), and 1.808 is the same as the distance reported by g_dist. Do you have any comments here as to why this is? Thanks /PK What you reported is not what you did. It appears that grompp, gtraj, and g_dist report the same value. Please also report the value that you get from your pullx.xvg file that you get from mdrun, which I suspect will also be the same. The difference that you report is actually between the first FRAME of your trajectory from g_dist and the first LINE of the file from the g_wham output. I see no reason to assume that the values in the output of g_wham must be time-ordered. Also, I have never used g_wham myself (I use an external program to do wham) and so I can not say if you are using it correctly. My overall conclusion is that you need to investigate g_wham output not worry about a new run at this stage. Regarding pull_pbcatom0, there is lots of information on the mailing list about this. It is a global atom number that defines the unit cell for selection of which periodic image of each molecule will be used for the pulling. If all of your box dimensions are 2*1.08 nm, then pull_pbcatom0 will not affect your results. Chris. -- gmx-users mailing listgmx-users@gromacs.org http://lists.gromacs.org/mailman/listinfo/gmx-users * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/Search before posting! * Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-users-requ...@gromacs.org. * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists --- Erik Marklund, PhD Dept. of Cell and Molecular Biology, Uppsala University. Husargatan 3, Box 596,75124 Uppsala, Sweden phone:+46 18 471 6688fax: +46 18 511 755 er...@xray.bmc.uu.se http://www2.icm.uu.se/molbio/elflab/index.html -- gmx-users mailing listgmx-users@gromacs.org http://lists.gromacs.org/mailman/listinfo/gmx-users * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/Search before posting! * Please don't post (un)subscribe
Re: [gmx-users] Re: Umbrella sampling question
Hi Erik, Thank you for your answer. I see your point now. Went and had a look in gmx_wham.c to see how things are calculated, and that makes sense. I was looking for an easy way of relating different parts of the resulting PMF to my original starting frames, as a means to understand exactly what the PMF is telling me, and I thought that by mapping distances in each frame to entries in the profile.xvg-file, I could accomplish that. I see that's not the case. I am still learning Umbrella sampling and WHAM, and some of it I still do not get fully. For example, how best (or easiest) to relate (a) events and structures in individual frames (or in the pulling simulation) to (b) the pull force vs time plot and (c) the PMF-plot. Thanks /PK 2012/11/14 Erik Marklund er...@xray.bmc.uu.se Hi, See below. 14 nov 2012 kl. 15.06 skrev Gmx QA: Hi Chris, and thank you for your reply. I should have included my g_dist command in my first mail. Here goes: I first run trjconv to extract the individual frames from my pulling trajectory $ trjconv -f pull.xtc -s pull.tpr -o conf.gro -sep -pbc mol -ur compact Then, g_dist like so: $ g_dist -s pull.tpr -f conf0.gro -n index.ndx -o dist0.xvg where index.ndx is an index-file that contains both pull-groups (ie the Protein in one group, and another group with the single atom from the molecule I'm pulling). I've checked that is is the same as in the mdp-file for the pulling simulation. From this dist0.xvg-file I extract the z-component of the distance: $ tail -n 1 dist0.xvg | awk '{print $5}' -1.8083301 And this is also the same value I get using g_traj to manually calculate it. Now, I run g_wham with this command line: $ g_wham -it tpr-files.dat -if pullf-files.dat -o -hist -unit kCal which gives me a profile.xvg file (per the default -o flag) containing the PMF. The top-part of the profile.xvg file looks like this: # This file was created Tue Nov 6 09:56:23 2012 # by the following command: # g_wham -it tpr-files.dat -if pullf-files.dat -o -hist -unit kCal # # g_wham is part of G R O M A C S: # # Gromacs Runs On Most of All Computer Systems # @title Umbrella potential @xaxis label z @yaxis label E (kcal mol\S-1\N) @TYPE xy 1.761971e+000.00e+00 1.782558e+00-1.436057e-01 1.803145e+00-3.857955e-01 and I assumed that the distances reported here should be the same as the distances calculated by g_dist for each frame, but as you can see they are not (1.761971 vs 1.8083301) I have been trying to understand how the 1.761971 distance is calculated, but I do not get it since I use the pullf.xvg-files as input to g_wham. And even using the z-component of the distance from the pullx.xvg-file and calculating ie the average does not give me a value close to 1.761971. For the next couple of frames, there are also differences, but they are not constant. profile.xvg doesn't contain distances vs time, so 1.761971 has nothing to do with the first frame. Nor is it (very) close to the average distance, for the file is energy vs. your reaction coordinate. In your simulations 1.761971 is the lowest distance ever encountered and g_wham sets that as a reference point for the rest of the free energy profile, hence the 0 in the next column. If you inspect the stdout from g_wham you may see how this is the case. Best, Erik I should add that after having sent my first mail, I found this thread: http://gromacs.5086.n6.nabble.com/umbrella-sampling-PMF-position-discrepancy-td5000886.html which fairly well describes my problem. Following the suggestion to change (or in my case add, it was not there originally) the pull_pbcatom0 entry to the mdp-file as an atom that is close to the COM of the protein, gives me a distance of 1.809 reported from running grompp (when making the tpr for the pulling simulation). I've started another pulling run with this new tpr, but I don't like making changes I don't understand. I guess I'm not clear about what pull_pbcatom0 does, but I though the distance in this case was rather unambiguous. Thanks /PK Can you please let us know exactly how you got the two values that you find to be different (but expected to be the same)? i.e. post your full g_dist command and explain how you observed the value in the output from mdrun. One frame should be enough for now (as long as you are sure -- and can prove to us -- that it is indeed the same frame). You don't define profile.xvg and I wonder if the discrepancy might be related to your use of pull_start=yes Do you see a consistent bias between mdrun output and g_dist output if you subtract the values for different time frames? If so, is this difference the same as the distance in the initial frame (pull_start=yes)? Chris. -- original message -- I'm a new poster on the maillist, and new to
Re: [gmx-users] Re: Umbrella sampling question
Hi, See below. 14 nov 2012 kl. 15.06 skrev Gmx QA: Hi Chris, and thank you for your reply. I should have included my g_dist command in my first mail. Here goes: I first run trjconv to extract the individual frames from my pulling trajectory $ trjconv -f pull.xtc -s pull.tpr -o conf.gro -sep -pbc mol -ur compact Then, g_dist like so: $ g_dist -s pull.tpr -f conf0.gro -n index.ndx -o dist0.xvg where index.ndx is an index-file that contains both pull-groups (ie the Protein in one group, and another group with the single atom from the molecule I'm pulling). I've checked that is is the same as in the mdp-file for the pulling simulation. From this dist0.xvg-file I extract the z-component of the distance: $ tail -n 1 dist0.xvg | awk '{print $5}' -1.8083301 And this is also the same value I get using g_traj to manually calculate it. Now, I run g_wham with this command line: $ g_wham -it tpr-files.dat -if pullf-files.dat -o -hist -unit kCal which gives me a profile.xvg file (per the default -o flag) containing the PMF. The top-part of the profile.xvg file looks like this: # This file was created Tue Nov 6 09:56:23 2012 # by the following command: # g_wham -it tpr-files.dat -if pullf-files.dat -o -hist -unit kCal # # g_wham is part of G R O M A C S: # # Gromacs Runs On Most of All Computer Systems # @title Umbrella potential @xaxis label z @yaxis label E (kcal mol\S-1\N) @TYPE xy 1.761971e+000.00e+00 1.782558e+00-1.436057e-01 1.803145e+00-3.857955e-01 and I assumed that the distances reported here should be the same as the distances calculated by g_dist for each frame, but as you can see they are not (1.761971 vs 1.8083301) I have been trying to understand how the 1.761971 distance is calculated, but I do not get it since I use the pullf.xvg-files as input to g_wham. And even using the z-component of the distance from the pullx.xvg-file and calculating ie the average does not give me a value close to 1.761971. For the next couple of frames, there are also differences, but they are not constant. profile.xvg doesn't contain distances vs time, so 1.761971 has nothing to do with the first frame. Nor is it (very) close to the average distance, for the file is energy vs. your reaction coordinate. In your simulations 1.761971 is the lowest distance ever encountered and g_wham sets that as a reference point for the rest of the free energy profile, hence the 0 in the next column. If you inspect the stdout from g_wham you may see how this is the case. Best, Erik I should add that after having sent my first mail, I found this thread: http://gromacs.5086.n6.nabble.com/umbrella-sampling-PMF-position-discrepancy-td5000886.html which fairly well describes my problem. Following the suggestion to change (or in my case add, it was not there originally) the pull_pbcatom0 entry to the mdp-file as an atom that is close to the COM of the protein, gives me a distance of 1.809 reported from running grompp (when making the tpr for the pulling simulation). I've started another pulling run with this new tpr, but I don't like making changes I don't understand. I guess I'm not clear about what pull_pbcatom0 does, but I though the distance in this case was rather unambiguous. Thanks /PK Can you please let us know exactly how you got the two values that you find to be different (but expected to be the same)? i.e. post your full g_dist command and explain how you observed the value in the output from mdrun. One frame should be enough for now (as long as you are sure -- and can prove to us -- that it is indeed the same frame). You don't define profile.xvg and I wonder if the discrepancy might be related to your use of pull_start=yes Do you see a consistent bias between mdrun output and g_dist output if you subtract the values for different time frames? If so, is this difference the same as the distance in the initial frame (pull_start=yes)? Chris. -- original message -- I'm a new poster on the maillist, and new to umbrella sampling but not to MD in general. I have recently done some pulling simulation with Gromacs, but have a few questions about the outcome, and in particular the distances that are calculated for the different windows. I realize this is a question that has come up before, and there are some useful posts in the archives, but nothing that exactly answers my questions. To begin, I ran a pulling-simulation pulling a molecule starting in the middle of a membrane (and inside a protein transport channel) straight up in z. To that end, I used the following mdp-settings: pull = umbrella pull_geometry = direction pull_vec1 = 0 0 1 pull_ngroups = 1 pull_start = yes pull_group0 = Protein pull_group1 = pg1 pull_rate1 = 0.001 pull_k1 = 500 So my pull groups are the COM of the protein, and one atom named pg1 on the pull molecule. To my understanding, this should