Dear Prof. Giannozzi,

Ah, I understand, that makes sense. Do you have any advice on how best to track down such a memory leak in this case? The behavior is reproducible with my setup.
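
One thing I could try is to log the resident memory of the pw.x ranks over the course of the run and check whether it keeps growing with the MD steps. A minimal Python sketch, assuming a Linux compute node with /proc available and MPI ranks that show up as processes named pw.x:

#!/usr/bin/env python3
"""Log the resident memory of all pw.x ranks on this node over time.

Minimal sketch: assumes a Linux node (/proc available) and MPI ranks
that appear as processes whose command name is "pw.x".
"""
import os
import time


def rss_kb(pid):
    """Return VmRSS in kB for a PID, or None if it cannot be read."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        return None
    return None


def pwx_pids():
    """Return the PIDs of all processes named pw.x."""
    pids = []
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            try:
                with open(f"/proc/{entry}/comm") as f:
                    if f.read().strip() == "pw.x":
                        pids.append(int(entry))
            except OSError:
                pass
    return pids


if __name__ == "__main__":
    # Poll once per minute: a flat total after the first MD step suggests
    # the per-process estimate is simply too large for the node, while a
    # steady increase over thousands of steps points to a leak.
    while True:
        sizes = [s for s in (rss_kb(p) for p in pwx_pids()) if s is not None]
        print(f"{time.strftime('%H:%M:%S')}  ranks={len(sizes)}  "
              f"total_rss={sum(sizes) / 1024:.1f} MB", flush=True)
        time.sleep(60)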

Kind regards
Lenz

On Wed, 23 Jun 2021 at 14:02, Paolo Giannozzi <p.gianno...@gmail.com> wrote:

> On Wed, Jun 23, 2021 at 1:31 PM Lenz Fiedler <fiedler.l...@gmail.com> wrote:
>
>> (and the increase in number of processes is most likely the reason for
>> the error, if I understand you correctly?)
>
> Not exactly. Too many processes may result in too much global memory
> usage, because some arrays are replicated on each process. If you exceed
> the globally available memory, the code will crash. BUT: it will do so during
> the first MD step, not after 2000 MD steps. The memory usage should not
> increase with the number of MD time steps. If it does, there is a memory
> leak, either in the code or somewhere else (libraries etc.).
>
> Paolo
>
>> This is not a problem. For my beryllium calculation it is more problematic,
>> since the 144-processor case really gives the best performance (I have
>> uploaded a file called performance_Be128.png to show my timing results),
>> but I still run out of memory after 2700 time steps. Although this is also
>> manageable, since I can always restart the calculation and perform another
>> 2700 time steps. With this I was able to perform 10,000 time steps in just
>> over a day. I am running more calculations on larger Be and Fe cells and I
>> will investigate this behavior there.
>>
>> I have also used the "gamma" option for the k-points to get the
>> performance benefits you outlined. For the Fe128 cell, I achieved optimal
>> performance with 144 processors and the "gamma" option (resulting in
>> about 90 s per SCF cycle). I am still not within my personal target of ~30 s
>> per SCF cycle, but I will start looking into the choice of my PSP and cutoff
>> (along with considering OpenMP and task-group parallelization) rather than
>> blindly throwing more and more processors at the problem.
>>
>> Kind regards
>> Lenz
>>
>> PhD Student (HZDR / CASUS)
>>
>> On Sat, 19 Jun 2021 at 09:25, Paolo Giannozzi <p.gianno...@gmail.com> wrote:
>>
>>> I tried your Fe job on a 36-core machine (with Gamma point to save time
>>> and memory) and found no evidence of memory leaks after more than 100 steps.
>>>
>>>> The best performance I was able to achieve so far was with 144 cores
>>>> defaulting to -nb 144, so am I correct in assuming that I should try e.g.
>>>> -nb 144 -ntg 2 for 288 cores?
>>>
>>> You should not use option -nb except in some rather special cases.
>>>
>>> Paolo
>>>
>>>> PhD Student (HZDR / CASUS)
>>>>
>>>> On Wed, 16 Jun 2021 at 07:33, Paolo Giannozzi <p.gianno...@gmail.com> wrote:
>>>>
>>>>> Hard to say without knowing exactly what goes out of which memory
>>>>> limits. Note that not all arrays are distributed across processors, so a
>>>>> considerable number of arrays are replicated on all processes. As a
>>>>> consequence, the total amount of required memory will increase with the
>>>>> number of MPI processes. Also note that a 128-atom cell is not "large" and
>>>>> 144 cores are not "a small number of processors". You will not get any
>>>>> further advantage by just increasing the number of processors, quite the
>>>>> opposite. If you have too many idle cores, you should consider
>>>>> - "task group" parallelization (option -ntg)
>>>>> - MPI+OpenMP parallelization (configure --enable-openmp)
>>>>> Please also note that ecutwfc=80 Ry is a rather large cutoff for a
>>>>> USPP (while ecutrho=320 is fine) and that running with K_POINTS Gamma
>>>>> instead of 1 1 1 0 0 0 will be faster and take less memory.
>>>>>
>>>>> Paolo
>>>>>
>>>>> On Mon, Jun 14, 2021 at 4:22 PM Lenz Fiedler <fiedler.l...@gmail.com> wrote:
>>>>>
>>>>>> Dear users,
>>>>>>
>>>>>> I am trying to perform an MD simulation for a large cell (128 Fe
>>>>>> atoms, gamma point) using pw.x and I get a strange scaling behavior. To
>>>>>> test the performance I ran the same MD simulation with an increasing
>>>>>> number of nodes (2, 4, 6, 8, etc.) using 24 cores per node. The simulation
>>>>>> is successful when using 2, 4, and 6 nodes, i.e. 48, 96, and 144 cores,
>>>>>> respectively (albeit slow, which is within my expectations for such a
>>>>>> small number of processors).
>>>>>> Going to 8 and more nodes, I run into an out-of-memory error after
>>>>>> about two time steps.
>>>>>> I am a little bit confused as to what could be the reason. Since a
>>>>>> smaller number of cores works, I would not expect a higher number of
>>>>>> cores to run into an out-of-memory error.
>>>>>> The 8-node run explicitly outputs at the beginning:
>>>>>> "     Estimated max dynamical RAM per process >     140.54 MB
>>>>>>       Estimated total dynamical RAM >      26.35 GB
>>>>>> "
>>>>>> which is well within the 2.5 GB I have allocated for each core.
>>>>>> I am obviously doing something wrong; could anyone point to what it is?
>>>>>> The input files for a 6- and 8-node run can be found here:
>>>>>> https://drive.google.com/drive/folders/1kro3ooa2OngvddB8RL-6Iyvdc07xADNJ?usp=sharing
>>>>>> I am using QE 6.6.
>>>>>>
>>>>>> Kind regards
>>>>>> Lenz
>>>>>>
>>>>>> PhD Student (HZDR / CASUS)
>>>>>
>>>>> --
>>>>> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
>>>>> Univ. Udine, via delle Scienze 206, 33100 Udine, Italy
>>>>> Phone +39-0432-558216, fax +39-0432-558222
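
To make the point about replicated arrays concrete: if each rank holds a fixed amount of replicated data plus its share of the distributed data, the per-rank footprint stays modest while the global footprint grows with the number of MPI ranks. A toy Python sketch, with made-up numbers that are not taken from the actual pw.x estimates:

#!/usr/bin/env python3
"""Toy model: global memory when part of each rank's data is replicated.

The numbers are made up for illustration and are NOT the actual pw.x
estimates: each rank is assumed to hold `replicated_mb` of replicated
arrays plus an equal share of `distributed_mb` of distributed arrays.
"""


def total_memory_gb(n_ranks, replicated_mb=100.0, distributed_mb=15000.0):
    per_rank_mb = replicated_mb + distributed_mb / n_ranks
    return n_ranks * per_rank_mb / 1024.0


if __name__ == "__main__":
    for n in (48, 96, 144, 192, 288):
        print(f"{n:4d} ranks -> ~{total_memory_gb(n):5.1f} GB total")

With these assumed numbers the per-rank memory drops as ranks are added, but the global total climbs from roughly 19 GB at 48 ranks to over 40 GB at 288 ranks, which is the kind of growth that can exhaust the globally available memory already at the first MD step.
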
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list users@lists.quantum-espresso.org
https://lists.quantum-espresso.org/mailman/listinfo/users