[gmx-users] different results when using different number cpus
Dear all,

I used Gromacs 3.3.1 to simulate two proteins in water (TIP3P). I ran two similar simulations, one on 2 CPUs and the other on 16 CPUs. Both simulations used the same .gro, .top, and .mdp files, but the results were not the same. In the 2-CPU simulation the two proteins move closer and closer together, whereas in the 16-CPU simulation they move apart. Is it normal to get different results when using different numbers of CPUs? The size of my simulation box is 9*7*7.

Best regards,
2007-12-5

Dechang Li, PhD Candidate
Department of Engineering Mechanics
Tsinghua University
Beijing 100084
PR China
Tel: +86-10-62773779(O)
Email: [EMAIL PROTECTED]

___
gmx-users mailing list gmx-users@gromacs.org
http://www.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/search before posting!
Please don't post (un)subscribe requests to the list. Use the www interface or send it to [EMAIL PROTECTED]
Can't post? Read http://www.gromacs.org/mailing_lists/users.php
Re: [gmx-users] different results when using different number cpus
Hi,

With Gromacs and (nearly) all other MD packages you will never be able to get binary identical results when running on different numbers of CPUs. Since MD is chaotic, the results can be very different.

Berk.

From: Carsten Kutzner [EMAIL PROTECTED]
Reply-To: Discussion list for GROMACS users gmx-users@gromacs.org
To: Discussion list for GROMACS users gmx-users@gromacs.org
Subject: Re: [gmx-users] different results when using different number cpus
Date: Wed, 05 Dec 2007 14:10:06 +0100

Hi Dechang, it is normal that results are not binary identical if you compare the same MD system on different numbers of processors. [...]
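Berk's remark that MD is chaotic can be illustrated with a toy system. The sketch below is not MD and has nothing to do with the Gromacs code: it iterates the logistic map (an assumed stand-in for any chaotic system, with the hypothetical helper name `logistic_trajectory`) from two starting values that differ by about 1e-15, roughly the size of a last-digit rounding difference in double precision.

```python
# Toy illustration only: the logistic map stands in for a chaotic system.
# It is NOT molecular dynamics and not part of GROMACS.

def logistic_trajectory(x0, steps, r=3.9):
    """Iterate x -> r * x * (1 - x) and return every visited value."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two starting points that differ by about 1e-15, comparable to a
# last-digit rounding difference in double precision.
a = logistic_trajectory(0.4, 200)
b = logistic_trajectory(0.4 + 1e-15, 200)

# Absolute difference between the two runs at every step.
differences = [abs(x - y) for x, y in zip(a, b)]

if __name__ == "__main__":
    for step in (0, 20, 40, 60, 80, 100):
        print(f"step {step:3d}: |difference| = {differences[step]:.3e}")
    print(f"largest difference over the run: {max(differences):.3f}")
```

The gap between the two runs starts at the rounding level and grows until the trajectories are unrelated, which is the behaviour described above for two MD runs that differ only in their last digits.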
[gmx-users] different results when using different number cpus
Message: 4
Date: Wed, 05 Dec 2007 14:19:28 +0100
From: Berk Hess [EMAIL PROTECTED]
Subject: Re: [gmx-users] different results when using different number cpus
To: gmx-users@gromacs.org
Message-ID: [EMAIL PROTECTED]
Content-Type: text/plain; format=flowed

Hi, With Gromacs and (nearly) all other MD packages you will never be able to get binary identical results when running on different number of CPUs. Since MD is chaotic, the results can be very different. Berk.

I can confirm that I see the same thing when repeating a simulation segment twice on 4 CPUs with gromacs-3.3.1 and fftw-3.1.2. Further, while trying to debug a colleague's parameters that give a LINCS error after long periods of simulation time on a single processor, I find that a proper restart from just prior to the crash does not lead to an exact repeat of the error (although an error does eventually occur). This was unfortunate, since my plan was to save the .trr every 100 ps and then do a restart in which I saved the .xtc every integration step to get a good look at the problem.

Carsten's comments about FFTW 3.x are useful, since I have been using fftw-3.1.2. Note that I did not test whether a run on 1 CPU generates an identical trajectory, only that the LINCS error is not exactly reproduced. I did the restart using the .trr/.edr files and set gen_vel=no and unconstrained_start=yes for the restart.

I agree that statistical properties will be properly reproduced, but I can imagine situations in which it would be valuable for a proper restart to be identical: e.g. an interest in the dynamics of quick, rare processes, in which one might run for a long time while saving the .xtc and .trr infrequently, and then restart at the proper place while saving the .xtc very frequently in order to capture the dynamics of an identified transition.
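The point that statistical properties are reproduced even when individual trajectories are not can be sketched with a toy chaotic system. The example below is an illustration under stated assumptions, not an MD calculation: the logistic map and the helper names `logistic_trajectory` and `mean` are hypothetical stand-ins. Two runs start about 1e-15 apart, end up visiting completely different values step by step, and still give nearly the same long-time average.

```python
# Toy illustration only: a chaotic map, not an MD simulation.

def logistic_trajectory(x0, steps, r=3.9):
    """Iterate x -> r * x * (1 - x) and return every visited value."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def mean(values):
    """Plain arithmetic mean of a sequence of floats."""
    return sum(values) / len(values)

# Two runs whose starting points differ at the rounding level.
run_a = logistic_trajectory(0.4, 100_000)
run_b = logistic_trajectory(0.4 + 1e-15, 100_000)

# Step-by-step the runs become unrelated ...
largest_gap = max(abs(x - y) for x, y in zip(run_a, run_b))

# ... while the long-time averages stay close to each other.
mean_a = mean(run_a)
mean_b = mean(run_b)

if __name__ == "__main__":
    print(f"largest step-by-step gap: {largest_gap:.3f}")
    print(f"time average, run A:      {mean_a:.4f}")
    print(f"time average, run B:      {mean_b:.4f}")
```

This also shows why an exactly repeatable restart is a separate requirement: agreement of averages says nothing about whether a particular event recurs at the same step.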
Re: [gmx-users] different results when using different number cpus
Hi Dechang,

it is normal that results are not binary identical if you compare the same MD system on different numbers of processors. If you use PME, you will probably get slightly different charge grids for 2 and for 16 processors, since the charge grid has to be divisible by the number of CPUs in the x- and y-directions. Even if you manually set the grid dimensions to be the same in both cases, your simulations could diverge when using version 3.x of FFTW. This version has a built-in timer and chooses the fastest of several algorithms, and that choice can differ even between two runs on the same number of processors, depending on the timing results. With different algorithms you get slight differences in the last digit of the computed numbers (rounding / truncation / order of evaluation), which then grow during the simulation and lead to diverging trajectories.

Of course the averaged properties of the simulation are unaffected by those differences and should be the same if averaged long enough.

You could use FFTW 2.x and manually set the FFT grid size to the same value for the 2 and 16 CPU cases, but I am not sure whether this is enough to get binary identical results. You could also repeat your simulations several times with (slightly) different starting conditions (maybe different starting velocities) to get a better picture of the average behaviour of your system. If in all 16-processor cases you see the proteins diverge and in all 2-processor cases you see them converge, I would guess something is wrong.

Hope that helps,
Carsten

--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics Department
Am Fassberg 11
37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/research/dep/grubmueller/
http://www.gwdg.de/~ckutzne
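The "order of evaluation" effect mentioned above can be reproduced in a few lines. This is a minimal sketch, not how Gromacs or FFTW actually reduce their data: the function names `naive_sum` and `reduce_over_ranks` and the four sample values are assumptions chosen for illustration. Each hypothetical rank sums its own contiguous chunk, and the partial sums are then added in rank order, so changing the number of ranks changes the order of the floating-point additions.

```python
# Minimal sketch: floating-point addition is not associative, so splitting
# the same sum across a different number of "ranks" can change the last digit.
# The rank layout here is hypothetical, not the GROMACS decomposition.

def naive_sum(values):
    """Add values strictly left to right in double precision."""
    total = 0.0
    for value in values:
        total += value
    return total

def reduce_over_ranks(values, n_ranks):
    """Each rank sums one contiguous chunk; partial sums are added in rank order."""
    if len(values) % n_ranks != 0:
        raise ValueError("number of values must be divisible by n_ranks")
    chunk = len(values) // n_ranks
    partials = [naive_sum(values[i * chunk:(i + 1) * chunk])
                for i in range(n_ranks)]
    return naive_sum(partials)

# Made-up sample contributions, identical for both "runs".
contributions = [0.0, 0.1, 0.2, 0.3]

serial = reduce_over_ranks(contributions, 1)     # ((0.0 + 0.1) + 0.2) + 0.3
two_ranks = reduce_over_ranks(contributions, 2)  # (0.0 + 0.1) + (0.2 + 0.3)

if __name__ == "__main__":
    print(repr(serial), repr(two_ranks), serial == two_ranks)
```

The two results agree to about 16 significant digits but are not bit-for-bit equal, and in a chaotic system a difference of that size is enough to send two runs down different trajectories.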
Re: [gmx-users] different results when using different number cpus
1. how long is the simulation?
2. did you start from equilibration (with gen_vel=yes) or production md?
3. ...

On 12/5/2007 8:28 PM, Dechang Li wrote:
Dear all, I used Gromacs3.3.1 to do a simulation about two proteins in water(tip3p). [...]
Re: [gmx-users] different results when using different number cpus
Yang Ye wrote:
1. how long is the simulation?
2. did you start from equilibration (with gen_vel=yes) or production md?
3. ...

On 12/5/2007 8:28 PM, Dechang Li wrote:
[...] Is that normal the different results when using different number cpus? The size of my simulation box is 9*7*7.

Answer is yes. http://wiki.gromacs.org/index.php/Reproducibility

--
David van der Spoel, Ph.D.
Molec. Biophys. group, Dept. of Cell Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone: +46184714205. Fax: +4618511755.
[EMAIL PROTECTED] [EMAIL PROTECTED] http://folding.bmc.uu.se