Re: [gmx-users] Gromacs 2018.3 Exceeding Memory Issue
Hi,

A 21K-atom simulation uses essentially no memory at all for the actual
GROMACS workload. That, plus all the supporting overhead for the related
processes on a node, will fit easily in 1 GB of memory. Just ask for
that :-) I'm not sure what units slurm is using when it reports the
memory over-use.

Mark

On Tue, Nov 27, 2018 at 4:53 PM Justin Lemkul wrote:
>
> On 11/27/18 10:43 AM, Peiyin Lee wrote:
> > Regarding the memory limit, it should be around 118 GB. The cluster
> > I am using has 20 cores per node and 6 GB of memory per core.
>
> Probably the most applicable attribute is the number of atoms in the
> system.
>
> -Justin
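For readers who hit the same symptom, a minimal Slurm batch header along
these lines would request the 1 GB suggested above. This is a sketch, not
the poster's actual script: the job name, node layout, and wall time are
hypothetical, and whether --mem (per node) or --mem-per-cpu is the right
knob depends on how the site has configured Slurm.

    #!/bin/bash
    #SBATCH --job-name=trpzip4-md     # hypothetical name
    #SBATCH --nodes=4                 # 4 nodes x 20 cores = 80 MPI ranks
    #SBATCH --ntasks-per-node=20      # 20 cores per node, as in this thread
    #SBATCH --mem=1G                  # per-node request; ample for ~21K atoms
    #SBATCH --time=24:00:00           # hypothetical wall time

    # On the units question: if slurmstepd printed both figures in
    # kilobytes, 123122052 kB would be about 117 GB while 12288 kB is
    # only 12 MB, so the two numbers are probably not in the same unit.
    # `scontrol show config` lists the relevant Slurm settings.
    mpirun -np 80 gmx_mpi mdrun -npme 16 -cpi -noappend

The mdrun line uses default file names, in line with Mark's advice
further down the thread; -cpi with no argument reads the default
checkpoint, state.cpt.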
Re: [gmx-users] Gromacs 2018.3 Exceeding Memory Issue
On 11/27/18 10:43 AM, Peiyin Lee wrote:
> Regarding the memory limit, it should be around 118 GB. The cluster I
> am using has 20 cores per node and 6 GB of memory per core. That's why
> I think it is strange for my job to exceed such a large memory limit.
> I have checked my mdp file and submission script and couldn't see any
> possible reason for this large memory usage. Do you have suggestions
> on other places to look?

Probably the most applicable attribute is the number of atoms in the
system.

-Justin
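To put a rough number on that point: the per-atom figure below is an
assumption for illustration, not a profiled measurement, but even a
generous allowance of a few kilobytes of working memory per atom
(coordinates, velocities, forces, neighbor lists, PME grids) leaves a
21073-atom system far below the limits being discussed:

    # Back-of-envelope only: 21073 atoms at an assumed 4 kB per atom
    echo "$(( 21073 * 4 / 1024 )) MB"   # prints "82 MB" - nowhere near 118 GB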
Re: [gmx-users] Gromacs 2018.3 Exceeding Memory Issue
Hi, Mark,

Thank you for all the suggestions! Regarding the memory limit, it
should be around 118 GB. The cluster I am using has 20 cores per node
and 6 GB of memory per core. That's why I think it is strange for my
job to exceed such a large memory limit. I have checked my mdp file and
submission script and couldn't see any possible reason for this large
memory usage. Do you have suggestions on other places to look?

Thank you so much for your help.

Regards,
Peiyin

On Tue, Nov 27, 2018 at 2:50 AM Mark Abraham wrote:
> 128MB is pretty tiny these days - no compute node will have less than
> 1GB of physical memory, so I suggest asking for that.
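On the question of other places to look, Slurm's own accounting data is
one concrete option, assuming the site has accounting enabled. A sketch,
using the job ID from the error message earlier in the thread:

    # Peak memory of a still-running job step:
    sstat --format=JobID,MaxRSS,MaxVMSize -j 12381762

    # The same after the job has finished, alongside what was requested:
    sacct --format=JobID,ReqMem,MaxRSS,State -j 12381762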
Re: [gmx-users] Gromacs 2018.3 Exceeding Memory Issue
Hi,

On Tue, Nov 27, 2018 at 4:31 AM Peiyin Lee wrote:
> Hi, all GROMACS users,
>
> I am trying to run jobs with GROMACS version 2018.3 and constantly get
> a memory-exceeded error. The system is all-atom, with 21073 atoms. The
> largest file estimated to be generated is around 5.8 GB.

Estimated sizes of disk files don't matter here.

> My jobs constantly get killed after running for only around 15
> minutes, with an error message like this: "slurmstepd: error: Job
> 12381762 exceeded memory limit (123122052 > 12288), being killed". I
> have tried using a

128MB is pretty tiny these days - no compute node will have less than
1GB of physical memory, so I suggest asking for that.

GROMACS should never leak memory as the simulation progresses - if you
think you are seeing that (e.g. with a slightly larger memory limit,
slurm interrupts a bit later) then we would like to see a bug report at
https://redmine.gromacs.org

> larger memory specification (12 GB/core), but the queue wait would be
> too long, and I don't think my job really uses that much memory. I
> have attached my .mdp file below:
>
> "title  = NVT Production Run for Trpzip4 in pure H2O
> define  =               ; empty - no position restraints applied
>
> ; Run parameters
> integrator = md         ; leap-frog integrator
> nsteps     = 5000       ; 0.002 * 5000 = 10 ps
> dt         = 0.002      ; 2 fs
>
> ; Output control
> nstenergy              = 1      ; save energies every step
> nstlog                 = 1      ; update log file every step
> nstxout-compressed     = 1      ; write compressed coordinates every step
> compressed-x-precision = 200
> compressed-x-grps      = System
>
> ; Bond parameters
> continuation         = yes       ; restarting after NVT
> constraint_algorithm = lincs     ; holonomic constraints
> constraints          = all-bonds ; all bonds (even heavy atom-H) constrained
> lincs_iter           = 1         ; accuracy of LINCS
> lincs_order          = 4         ; also related to accuracy
>
> ; Neighborsearching
> ns_type  = grid ; search neighboring grid cells
> nstlist  = 5    ; 10 fs
> rlist    = 1.2  ; short-range neighbor-list cutoff (in nm)
> rcoulomb = 1.2  ; short-range electrostatic cutoff (in nm)
> rvdw     = 1.2  ; short-range van der Waals cutoff (in nm)
>
> ; Electrostatics
> coulombtype    = PME  ; Particle Mesh Ewald for long-range electrostatics
> pme_order      = 4    ; cubic interpolation
> fourierspacing = 0.16 ; grid spacing for FFT
>
> ; Temperature coupling is on
> tcoupl  = V-rescale      ; velocity-rescaling thermostat
> tc-grps = Protein SOL NA ; 3 coupling groups

Off topic, but it is not good practice to couple ions separately. Did
you perhaps follow some tutorial that we can ask the author to fix?
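For reference, the common convention is two coupling groups, with the
handful of ions in the same bath as the solvent. A sketch of what the
corrected lines might look like for this system (Non-Protein is one of
the index groups GROMACS generates by default):

    tcoupl  = V-rescale
    tc-grps = Protein Non-Protein   ; solute, and solvent + ions together
    tau_t   = 0.5     0.5           ; time constants, ps (one per group)
    ref_t   = 400     400           ; reference temperatures, K (one per group)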
> tau_t = 0.5 0.5 0.5 ; time constant, in ps
> ref_t = 400 400 400 ; reference temperature, one for each group, in K
>
> ; Pressure coupling is off
> pcoupl          = no        ; no pressure coupling (NVT ensemble)
> pcoupltype      = isotropic ; not used while pcoupl = no
> tau_p           = 5.0       ; time constant, in ps
> ref_p           = 1.0       ; reference pressure, in bar
> compressibility = 4.5e-5    ; isothermal compressibility, bar^-1
>
> ; Periodic boundary conditions
> pbc = xyz ; 3-D PBC
>
> ; Dispersion correction
> DispCorr = EnerPres ; account for cut-off vdW scheme
>
> ; Velocity generation
> gen_vel = no ; velocity generation is off"
>
> and the command I used to run was "mpirun -np 80 gmx_mpi mdrun -npme
> 16 -noappend -s md.tpr -c md.gro -e md.edr -x md.xtc -cpi md.cpt -cpo
> md.cpt -g md.log".

Looks fine. I encourage everybody to use the default file names, and to
organize their projects into natural groups for the infrastructure, like
directories. Renaming the files doesn't add value and makes your life
more complicated when you're doing restarts.

Mark

> This is my first time posting, so please excuse anything that's
> unclear. I will try to clarify if needed. Any help is greatly
> appreciated!
>
> Regards,
> Peiyin Lee

--
Gromacs Users mailing list

* Please search the archive at
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
send a mail to gmx-users-requ...@gromacs.org.