Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi
Hi,

This all sounds super interesting. However, is there anything I can do for now, or do I need to just find the best combination by hand?

Thank you very much,
Max

-----Original Message-----
From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se [mailto:gromacs.org_gmx-users-boun...@maillist.sys.kth.se] On Behalf Of Carsten Kutzner
Sent: Monday, 29 September 2014 19:23
To: gmx-us...@gromacs.org
Subject: Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi

On 29 Sep 2014, at 18:40, Mark Abraham <mark.j.abra...@gmail.com> wrote:

Hi,

That seems suitable.

Oh, it just occurred to me that on systems that use LoadLeveler we have no means of specifying the number of MPI ranks on the command line, since 'poe' has no switch for that. So at least for this case I guess we also need to make the test optional.

Carsten

Mark

On Mon, Sep 29, 2014 at 6:32 PM, Carsten Kutzner <ckut...@gwdg.de> wrote:

Hi,

On 29 Sep 2014, at 18:17, Mark Abraham <mark.j.abra...@gmail.com> wrote:

Hi,

It can't be fixed, because there is no surefire way to run an arbitrary tpr on an arbitrary number of ranks, regardless of how you guess -npme might succeed.

What about making this check on two ranks always, regardless of what was specified on the g_tune_pme command line? On two ranks we will never have separate PME ranks, so it should always work, since we end up with two ranks doing first PP and then PME. If the system is so small that you cannot decompose it into two DD domains, there is no use doing tuning anyway. So even if you say

g_tune_pme -np 48 -s input.tpr

we first check with

mpirun -np 2 mdrun -s input.tpr

and only after that continue with -np 48.

Carsten

We should just make the check optional, instead of being a deal breaker.

Mark

On Sep 29, 2014 4:35 PM, Carsten Kutzner <ckut...@gwdg.de> wrote:

Hi,

I see where the problem is. There is an initial check in g_tune_pme to make sure that parallel runs can be executed at all. This is run with the automatic number of PME-only ranks, which is 11 for your input file. Unfortunately, this results in 37 PP ranks, for which no domain decomposition can be found. At some point in the past we discussed that this could happen and that it should be fixed. Will open a bug entry.

Thanks,
Carsten

On 29 Sep 2014, at 15:36, Ebert Maximilian <m.eb...@umontreal.ca> wrote:

Hi,

this is the command:

setenv MDRUN mdrun_mpi
g_tune_pme_mpi -np 48 -s ../eq_nvt/1ZG4_nvt.tpr -launch

Here is the output of perf.out:

P E R F O R M A N C E   R E S U L T S
g_tune_pme_mpi for Gromacs VERSION 5.0.1
Number of ranks         : 48
The mpirun command is   : mpirun
Passing # of ranks via  : -np
The mdrun command is    : mdrun_mpi
mdrun args benchmarks   : -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log
Benchmark steps         : 1000
dlb equilibration steps : 100
mdrun args at launchtime:
Repeats for each test   : 2
Input file              : ../eq_nvt/1ZG4_nvt.tpr
PME/PP load estimate    : 0.151964
Number of particles     : 39489
Coulomb type            : PME
Grid spacing x y z      : 0.114561 0.114561 0.114561
Van der Waals type      : Cut-off

Will try these real/reciprocal workload settings:
No.  scaling  rcoulomb  nkx  nky  nkz  spacing   rvdw  tpr file
 0   1.00     1.20      72   72   72   0.12      1.20  ../eq_nvt/1ZG4_nvt_bench00.tpr
 1   1.10     1.32      64   64   64   0.132000  1.32  ../eq_nvt/1ZG4_nvt_bench01.tpr
 2   1.20     1.44      60   60   60   0.144000  1.44  ../eq_nvt/1ZG4_nvt_bench02.tpr

Note that in addition to the Coulomb radius and the Fourier grid, other input settings were also changed (see table above). Please check if the modified settings are appropriate.

Individual timings for input file 0 (../eq_nvt/1ZG4_nvt_bench00.tpr):
PME ranks   Gcycles   ns/day   PME/f   Remark

Cannot run the benchmark simulations! Please check the error message of mdrun for the source of the problem. Did you provide a command line argument that neither g_tune_pme nor mdrun understands?

Offending command:
mpirun -np 48 mdrun_mpi -npme 11 -s ../eq_nvt/1ZG4_nvt_bench00.tpr -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log -nsteps 1 -quiet

and here are parts of the bench.log:

Log file opened on Mon Sep 29 08:56:38 2014
Host: node-e1-67  pid: 24470  rank ID: 0  number of ranks: 48
GROMACS: gmx mdrun, VERSION 5.0.1

GROMACS is written by:
Emile Apol, Rossen Apostolov, Herman J.C. Berendsen, Par Bjelkmar, Aldert van Buuren, Rudi van Drunen, Anton Feenstra, Sebastian Fritsch, Gerrit Groenhof, Christoph Junghans, Peter Kasson, Carsten
Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi
Hi Max,

On 30 Sep 2014, at 09:40, Ebert Maximilian <m.eb...@umontreal.ca> wrote:

Hi,

This all sounds super interesting. However, is there anything I can do for now or do I need to just find the best combination by hand?

For now you can add the "-npme all -min 0.25 -max 0.33" options to g_tune_pme to make it work. Note that the values are just examples (further explanation in g_tune_pme -h). The idea here is that by choosing a -max value other than 0.5 (the default) you make the initial check work; then the rest will also run through. You may have to play around a bit with the exact value until you end up with something other than 37 PP ranks at the beginning.

Carsten

Thank you very much,
Max
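Put together as a single job-script fragment, Carsten's workaround might look like this. This is a sketch only: the -min/-max values are the examples given in this thread and may need adjusting, the binary may be named g_tune_pme or g_tune_pme_mpi depending on the install, and whether you use bash's `export` or csh's `setenv` depends on your login shell.

```shell
# Tell g_tune_pme which parallel mdrun binary to launch for the benchmarks.
export MDRUN=mdrun_mpi

# Restrict the fraction of PME-only ranks that g_tune_pme will try, so the
# initial check does not land on 37 PP ranks (a large prime). Example values
# from this thread; see g_tune_pme -h for details.
g_tune_pme -np 48 -s ../eq_nvt/1ZG4_nvt.tpr \
           -npme all -min 0.25 -max 0.33 -launch
```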
Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi
Hi,

is this the only output? Don’t you get a perf.out file that lists which settings are optimal? What exactly was the command line you used?

Carsten

On 29 Sep 2014, at 15:01, Ebert Maximilian <m.eb...@umontreal.ca> wrote:

Hi,

I just tried that and I got the following error message (bench.log). Any idea what could be wrong?

Thank you very much,
Max

Initializing Domain Decomposition on 48 ranks
Dynamic load balancing: auto
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
    two-body bonded interactions: 0.422 nm, LJ-14, atoms 1444 1452
  multi-body bonded interactions: 0.422 nm, Proper Dih., atoms 1444 1452
Minimum cell size due to bonded interactions: 0.464 nm
Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.218 nm
Estimated maximum distance required for P-LINCS: 0.218 nm

-------------------------------------------------------
Program mdrun_mpi, VERSION 5.0.1
Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/mdlib/domdec_setup.c, line: 728

Fatal error:
The number of ranks you selected (37) contains a large prime factor 37. In most cases this will lead to bad performance. Choose a number with smaller prime factors or set the decomposition (option -dd) manually.
For more information and tips for troubleshooting, please check the GROMACS website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

-----Original Message-----
From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se [mailto:gromacs.org_gmx-users-boun...@maillist.sys.kth.se] On Behalf Of Carsten Kutzner
Sent: Thursday, 25 September 2014 19:29
To: gmx-us...@gromacs.org
Subject: Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi

Hi,

don't invoke g_tune_pme with 'mpirun', because it is a serial executable that itself invokes parallel MD runs for testing. Use

export MDRUN=mdrun_mpi
g_tune_pme -np 24 -s 1ZG4_nvt.tpr -launch

See also g_tune_pme -h. You may need to recompile g_tune_pme without MPI enabled (depends on your queueing system).

Best,
Carsten

On 25 Sep 2014, at 15:10, Ebert Maximilian <m.eb...@umontreal.ca> wrote:

Dear list,

I tried using g_tune_pme_mpi with the command

mpirun -np 24 g_tune_pme_mpi -np 24 -s 1ZG4_nvt.tpr -launch

on GROMACS 5.0.1, but I get the following error message:

--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: mdrun
Node:

while attempting to start process rank 0.
--------------------------------------------------------------------------
24 total processes failed to start

Any idea why this is? Shouldn't g_tune_pme_mpi call mdrun_mpi instead?

Thank you very much,
Max

--
Gromacs Users mailing list

* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
* For (un)subscribe requests visit https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-requ...@gromacs.org.

--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa
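The fatal error in this exchange comes from mdrun rejecting a PP-rank count whose largest prime factor is big: 48 total ranks minus the 11 automatically chosen PME ranks leaves 37 PP ranks, and 37 is prime, so the domain decomposition cannot split the box into a sensible grid. A quick check outside GROMACS (not part of any GROMACS tool; assumes GNU coreutils `factor` is available, and the "no prime factor above 7" threshold is an illustrative choice, not mdrun's exact rule) can list which -npme values would leave a decomposable PP count:

```shell
# For a 48-rank run, list -npme candidates whose PP-rank count (48 - npme)
# has no prime factor larger than 7. 'factor' prints prime factors in
# ascending order, so awk's $NF is the largest one.
np=48
for npme in $(seq 1 $((np / 2))); do
    pp=$((np - npme))
    largest=$(factor "$pp" | awk '{print $NF}')
    if [ "$largest" -le 7 ]; then
        echo "npme=$npme -> $pp PP ranks (largest prime factor: $largest)"
    fi
done
```

With np=48 this rules out npme=11 (37 PP ranks, prime) and keeps candidates like npme=12 (36 = 2^2 * 3^2 PP ranks).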
Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi
Build CPU brand: Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
Build CPU family: 6  Model: 44  Stepping: 2
Build CPU features: aes apic clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3
C compiler: /RQusagers/apps/Logiciels/gcc/4.8.1/bin/gcc GNU 4.8.1
C compiler flags: -msse4.1 -Wno-maybe-uninitialized -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds -O3 -DNDEBUG
C++ compiler: /RQusagers/apps/Logiciels/gcc/4.8.1/bin/g++ GNU 4.8.1
C++ compiler flags: -msse4.1 -std=c++0x -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds -O3 -DNDEBUG
Boost version: 1.55.0 (internal)

n = 0
E-zt: n = 0
swapcoords = no
adress = FALSE
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
grpopts:
  nrdf: 10175.6 70836.4
  ref-t: 304.65 304.65
  tau-t: 0.5 0.5
  annealing: Single Single
  annealing-npoints: 4 4
  annealing-time [0]: 0.0 200.0 300.0 750.0
  annealing-temp [0]: 10.0 100.0 100.0 304.6
  annealing-time [1]: 0.0 200.0 300.0 750.0
  annealing-temp [1]: 10.0 100.0 100.0 304.6
  acc: 0 0 0
  nfreeze: N N N
  energygrp-flags[ 0]: 0

Overriding nsteps with value passed on the command line: 1 steps, 0.002 ps

Initializing Domain Decomposition on 48 ranks
Dynamic load balancing: auto
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
    two-body bonded interactions: 0.422 nm, LJ-14, atoms 1444 1452
  multi-body bonded interactions: 0.422 nm, Proper Dih., atoms 1444 1452
Minimum cell size due to bonded interactions: 0.464 nm
Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.218 nm
Estimated maximum distance required for P-LINCS: 0.218 nm

-------------------------------------------------------
Program mdrun_mpi, VERSION 5.0.1
Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/mdlib/domdec_setup.c, line: 728

Fatal error:
The number of ranks you selected (37) contains a large prime factor 37. In most cases this will lead to bad performance. Choose a number with smaller prime factors or set the decomposition (option -dd) manually.
For more information and tips for troubleshooting, please check the GROMACS website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi
General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

GROMACS: gmx mdrun, VERSION 5.0.1
Executable: /home/apps/Logiciels/gromacs/gromacs-5.0.1/bin/gmx_mpi
Library dir: /home/apps/Logiciels/gromacs/gromacs-5.0.1/share/gromacs/top
Command line:
  mdrun_mpi -npme 11 -s ../eq_nvt/1ZG4_nvt_bench00.tpr -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log -nsteps 1 -quiet

Gromacs version:   VERSION 5.0.1
Precision:         single
Memory model:      64 bit
MPI library:       MPI
OpenMP support:    enabled
GPU support:       disabled
invsqrt routine:   gmx_software_invsqrt(x)
SIMD instructions: SSE4.1
FFT library:       fftw-3.3.3-sse2
RDTSCP usage:      enabled
C++11 compilation: enabled
TNG support:       enabled
Tracing support:   disabled
Built on:          Tue Sep 23 09:58:07 EDT 2014
Built by:          rqchpbib@briaree1 [CMAKE]
Build OS/arch:     Linux 2.6.32-71.el6.x86_64 x86_64
Build CPU vendor:  GenuineIntel
Build CPU brand:   Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi
Hi,

don’t invoke g_tune_pme with ‘mpirun’, because it is a serial executable that itself invokes parallel MD runs for testing. Use

export MDRUN=mdrun_mpi
g_tune_pme -np 24 -s 1ZG4_nvt.tpr -launch

See also g_tune_pme -h. You may need to recompile g_tune_pme without MPI enabled (depends on your queueing system).

Best,
Carsten

On 25 Sep 2014, at 15:10, Ebert Maximilian <m.eb...@umontreal.ca> wrote:

Dear list,

I tried using g_tune_pme_mpi with the command

mpirun -np 24 g_tune_pme_mpi -np 24 -s 1ZG4_nvt.tpr -launch

on GROMACS 5.0.1, but I get the following error message:

--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: mdrun
Node:

while attempting to start process rank 0.
--------------------------------------------------------------------------
24 total processes failed to start

Any idea why this is? Shouldn't g_tune_pme_mpi call mdrun_mpi instead?

Thank you very much,
Max

--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa

--
Gromacs Users mailing list

* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
* For (un)subscribe requests visit https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-requ...@gromacs.org.
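Carsten's point above can be summarized as a short script fragment. This is a sketch only (binary names as used in this thread; your install may provide g_tune_pme, g_tune_pme_mpi, or gmx tune_pme):

```shell
# Wrong: g_tune_pme is a serial driver. Launching it under mpirun makes
# every one of the 24 ranks try to run its own copy of the tuner.
#   mpirun -np 24 g_tune_pme_mpi -np 24 -s 1ZG4_nvt.tpr -launch

# Right: run the tuner serially and tell it which parallel mdrun to use.
# It then calls 'mpirun -np ... $MDRUN ...' itself for each benchmark.
export MDRUN=mdrun_mpi
g_tune_pme -np 24 -s 1ZG4_nvt.tpr -launch
```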