I tried your Fe job on a 36-core machine (with Gamma point to save time and memory) and found no evidence of memory leaks after more than 100 steps.
> The best performance I was able to achieve so far was with 144 cores
> defaulting to -nb 144, so am I correct to assume that I should try e.g.
> -nb 144 -ntg 2 for 288 cores?

You should not use option -nb except in some rather special cases.

Paolo
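For illustration, a 288-core run along these lines (task groups rather than band groups) might look like the following; the input file name md.in and the node layout are placeholders, not details from this thread:

    # 288 MPI processes, 2 task groups; -nb left at its default of 1
    mpirun -np 288 pw.x -ntg 2 -in md.in > md.out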
> PhD Student (HZDR / CASUS)
>
> On Wed, Jun 16, 2021 at 07:33, Paolo Giannozzi <p.gianno...@gmail.com>
> wrote:
>
>> Hard to say without knowing exactly what goes out of which memory
>> limits. Note that not all arrays are distributed across processors, so
>> a considerable number of arrays are replicated on all processes. As a
>> consequence, the total amount of required memory will increase with the
>> number of MPI processes. Also note that a 128-atom cell is not "large"
>> and 144 cores are not "a small number of processors". You will not get
>> any further advantage by just increasing the number of processors;
>> quite the opposite. If you have too many idle cores, you should consider
>> - "task group" parallelization (option -ntg)
>> - MPI+OpenMP parallelization (configure --enable-openmp)
>> Please also note that ecutwfc=80 Ry is a rather large cutoff for a USPP
>> (while ecutrho=320 is fine), and that running with K_POINTS Gamma
>> instead of 1 1 1 0 0 0 will be faster and take less memory.
>>
>> Paolo
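To make those two suggestions concrete (md.in and the thread count below are placeholders, not taken from the thread): the k-point change is a one-line edit of the pw.x input, replacing

    K_POINTS automatic
    1 1 1 0 0 0

with

    K_POINTS gamma

which selects the Gamma-only code path (real instead of complex wavefunctions). A hybrid MPI+OpenMP run on the same 288 cores, assuming pw.x was configured with --enable-openmp, could be launched as

    export OMP_NUM_THREADS=4    # 72 MPI processes x 4 threads = 288 cores
    mpirun -np 72 pw.x -in md.in > md.out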
>> On Mon, Jun 14, 2021 at 4:22 PM Lenz Fiedler <fiedler.l...@gmail.com>
>> wrote:
>>
>>> Dear users,
>>>
>>> I am trying to perform an MD simulation for a large cell (128 Fe
>>> atoms, gamma point) using pw.x, and I get strange scaling behavior. To
>>> test the performance I ran the same MD simulation with an increasing
>>> number of nodes (2, 4, 6, 8, etc.) using 24 cores per node. The
>>> simulation is successful with 2, 4, and 6 nodes, i.e. 48, 96, and 144
>>> cores respectively (albeit slow, which is within my expectations for
>>> such a small number of processors). Going to 8 and more nodes, I run
>>> into an out-of-memory error after about two time steps.
>>> I am a little confused as to what the reason could be. Since a smaller
>>> number of cores works, I would expect a higher number of cores to run
>>> without an OOM error as well.
>>> The 8-node run explicitly outputs at the beginning:
>>> "   Estimated max dynamical RAM per process >     140.54 MB
>>>     Estimated total dynamical RAM >      26.35 GB
>>> "
>>> which is well within the 2.5 GB I have allocated for each core.
>>> I am obviously doing something wrong; could anyone point out what it
>>> is? The input files for a 6- and 8-node run can be found here:
>>> https://drive.google.com/drive/folders/1kro3ooa2OngvddB8RL-6Iyvdc07xADNJ?usp=sharing
>>> I am using QE 6.6.
>>>
>>> Kind regards
>>> Lenz
>>>
>>> PhD Student (HZDR / CASUS)

--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 206, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222

_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list  users@lists.quantum-espresso.org
https://lists.quantum-espresso.org/mailman/listinfo/users
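As a rough cross-check of the figures quoted above (an illustrative calculation, not from the thread): on 8 nodes x 24 cores there are 192 MPI processes, and

    26.35 GB total / 192 processes ~= 137 MB per process

which is consistent with the printed per-process maximum of 140.54 MB. Note, per the remark earlier in the thread, that arrays replicated on every process make the total required memory grow with the number of MPI processes even at fixed system size, which is why simply adding processes can make memory use worse rather than better.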