Re: [Pw_forum] Memory usage estimate of the calculation
Yes, those are "Megabytes", not "Megabits".

Paolo

On Thu, Jan 5, 2017 at 12:25 AM, Jun Jiang wrote:
> Dear All,
>
> I was running QE 6.0 (pw.x) and found the memory usage estimate in the
> output file, like below:
>
>      Estimated max dynamical RAM per process    3207.98Mb
>      Estimated total allocated dynamical RAM  153983.19Mb
>
> To make full use of the RAM and CPU, does this mean that if I allocate
> 3208 MB per CPU and 153984 MB in total for the job, it will be enough
> for this calculation? If the real RAM usage turns out to be larger than
> that, how much should I add to this estimate, and how can I estimate
> the remaining part?
>
> PS: Does "Mb" in the code mean megabyte (MB) or megabit (Mb)? I think
> it should be megabyte (MB).
>
> Thanks,
> Jun Jiang

--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222

___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum
[Pw_forum] Memory usage estimate of the calculation
Dear All,

I was running QE 6.0 (pw.x) and found the memory usage estimate in the
output file, like below:

   Estimated max dynamical RAM per process    3207.98Mb
   Estimated total allocated dynamical RAM  153983.19Mb

To make full use of the RAM and CPU, does this mean that if I allocate
3208 MB per CPU and 153984 MB in total for the job, it will be enough for
this calculation? If the real RAM usage turns out to be larger than that,
how much should I add to this estimate, and how can I estimate the
remaining part?

PS: Does "Mb" in the code mean megabyte (MB) or megabit (Mb)? I think it
should be megabyte (MB).

Thanks,
Jun Jiang
[Pw_forum] Memory usage by pw.x
I'm sorry for a previous incomplete message.

> sed -ri "s/(^ *)(allocate.*$)/\1\2\n\1 CALL mem_whatever()/i" $(find
> /where/is/espresso -name \*.f90)

That was very useful, thank you. I now get 1.9 GB out of 2.4, which
starts giving a usable estimate, but I do understand that getting an
accurate value is very complex.

For my purposes, I'll monitor the memory occupancy with a script, which
follows below in case someone finds it useful...

Guido

command=pw.x

maxsecs=$((60*60*24))
delay=1
nsteps=$((maxsecs/delay))

echo "#when RAM PID (0 for all $command instances)"

for ((i=0; i<nsteps; i++)); do
   timer=`date +"%s.%N"`
   ps -eo comm,rss,pid |
      awk -v comm=$command -v timer=$timer '
         BEGIN {tot=0}
         ($1==comm) {
            tot+=$2;
            print timer, $2, $3;
         }
         END {if (tot) print timer, tot, 0}
      '
   sleep $delay
done
[Pw_forum] Memory usage by pw.x
This was indeed very useful, thank you. I got 1.9 GB out of about 2400 MB.

On 08/30/2012 04:34 PM, Lorenzo Paulatto wrote:
> sed -ri "s/(^ *)(allocate.*$)/\1\2\n\1 CALL mem_whatever()/i" $(find
> /where/is/espresso -name \*.f90)

--
Guido Fratesi

Dipartimento di Scienza dei Materiali
Universita` degli Studi di Milano-Bicocca
via Cozzi 53, 20125 Milano, Italy
Phone: +39 02 6448 5183
email: fratesi at mater.unimib.it
[Pw_forum] Memory usage by pw.x
On Thu, 30 Aug 2012 16:34:03 +0200, Lorenzo Paulatto wrote:
> On 30 August 2012 15:54, Guido Fratesi wrote:
>
>> Yet in my test, the max memory printed by top is 2.3GB within the first
>> step of the SCF cycle, but the standard call to memstat in electrons.f90
>> returned 744.1 Mb and the one tracked as described above 1168.572 Mb
>> (maximum reached earlier than that 744.1 Mb).
>
> Measuring the amount of memory in the "clock" subroutines is far from
> optimal, as they are usually called before temporary variables are
> allocated (start_clock) and after they are deallocated (stop_clock).
>
> With a command like this:
> sed -ri "s/(^ *)(allocate.*$)/\1\2\n\1 CALL mem_whatever()/i" $(find
> /where/is/espresso -name \*.f90)
>
> you can add a call to mem_whatever after *every* allocate in the entire
> code. This will also modify all your f90 files, so I suggest making a
> backup first. This should result in a quite accurate report of memory
> consumption (at the cost of a certain performance hit, I guess).

You could use valgrind with the 'massif' tool. This should be able to
tell you in which routine the memory usage peaks, and thus where it is
best to put your memory check. Running QE under valgrind can be a very
slow process, though...

Simon

--
Simon Binnie | Post Doc, Condensed Matter Sector
Scuola Internazionale di Studi Avanzati (SISSA)
Via Bonomea 256 | 34100 Trieste | sbinnie at sissa.it
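For reference, a minimal sketch of the massif workflow Simon describes (my own, not from the thread; it assumes a Linux box with valgrind and its companion tool ms_print installed, and uses /bin/true as a stand-in for the real pw.x invocation):

```shell
# Run a target under the massif heap profiler (very slow), then render
# the text report. Substitute your pw.x command line for "$target".
target=/bin/true
if command -v valgrind >/dev/null 2>&1; then
    valgrind --tool=massif --massif-out-file=massif.out "$target"
    # ms_print shows a text graph of heap usage over time and a table of
    # snapshots; the peak snapshot names the allocation sites.
    ms_print massif.out | head -n 30
else
    echo "valgrind not installed; skipping the massif demo"
fi
```

The per-routine breakdown in the peak snapshot is what tells you where a memstat-style check would be most informative.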
[Pw_forum] Memory usage by pw.x
On 30 August 2012 15:54, Guido Fratesi wrote:
> Yet in my test, the max memory printed by top is 2.3GB within the first
> step of the SCF cycle, but the standard call to memstat in electrons.f90
> returned 744.1 Mb and the one tracked as described above 1168.572 Mb
> (maximum reached earlier than that 744.1 Mb).

Measuring the amount of memory in the "clock" subroutines is far from
optimal, as they are usually called before temporary variables are
allocated (start_clock) and after they are deallocated (stop_clock).

With a command like this:

sed -ri "s/(^ *)(allocate.*$)/\1\2\n\1 CALL mem_whatever()/i" $(find
/where/is/espresso -name \*.f90)

you can add a call to mem_whatever after *every* allocate in the entire
code. This will also modify all your f90 files, so I suggest making a
backup first. This should result in a quite accurate report of memory
consumption (at the cost of a certain performance hit, I guess).

Any allocations in C files and external libraries could still escape:
e.g. the FFT library could decide to allocate a temporary array of 1 GB,
and you would not see this in the final report.

I suggest putting the subroutine mem_whatever somewhere in flib/ and not
inside a module, otherwise you'll also be forced to add a USE to every
file, which can be annoying.

bests

--
Lorenzo Paulatto IdR @ IMPMC/CNRS & Université Paris 6
phone: +33 (0)1 44275 084 / skype: paulatz
www: http://www-int.impmc.upmc.fr/~paulatto/
mail: 23-24/4?16 Boîte courrier 115, 4 place Jussieu 75252 Paris Cédex 05
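For anyone who wants to see what the substitution does before touching a source tree, here is a small self-contained demo (my own, not from the thread): it applies the same sed expression, but without -i and to a throwaway file, so nothing is edited in place. GNU sed is assumed (-r), and mem_whatever is the placeholder routine name used above.

```shell
# Create a sample Fortran file and print the instrumented version.
tmpdir=$(mktemp -d)
cat > "$tmpdir/demo.f90" <<'EOF'
SUBROUTINE demo()
  REAL, ALLOCATABLE :: a(:)
  ALLOCATE (a(1000))
  DEALLOCATE (a)
END SUBROUTINE demo
EOF
# Same expression as above; the trailing /i makes the match on
# "allocate" case-insensitive, and \n splits the line in two.
out=$(sed -r "s/(^ *)(allocate.*$)/\1\2\n\1 CALL mem_whatever()/i" "$tmpdir/demo.f90")
echo "$out"
rm -rf "$tmpdir"
```

Note that only lines *starting* with "allocate" (after indentation) are matched, so the DEALLOCATE line is left alone and gets no call inserted after it.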
[Pw_forum] Memory usage by pw.x
On Thu, 2012-08-30 at 15:54 +0200, Guido Fratesi wrote:
> Yet in my test, the max memory printed by top is 2.3GB within the first
> step of the SCF cycle, but the standard call to memstat in electrons.f90
> returned 744.1 Mb and the one tracked as described above 1168.572 Mb

My (limited) understanding is that the internal call to "memstat" reports
only the dynamically allocated memory, while "top" reports all memory
taken by the process, including shared libraries and whatnot. It seems to
me that the difference, (2.3-1.2)Gb=1.1Gb, is a lot of memory, but I have
no idea how to figure out where all this memory comes from (or goes to).
This is stuff for OS wizards.

P.

--
Paolo Giannozzi, IOM-Democritos and University of Udine, Italy
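To see the kind of gap Paolo describes on a live process, one can read both numbers straight from /proc (a Linux-only sketch of my own; the current shell stands in for pw.x, and the field names are standard Linux, not QE):

```shell
# VmSize is the full address space (what top shows as VIRT); VmRSS is
# resident physical memory (top's RES). A memstat-style counter of
# explicit dynamic allocations is smaller than both.
pid=$$
vmsize=$(awk '/^VmSize:/ {print $2}' /proc/$pid/status)
vmrss=$(awk '/^VmRSS:/ {print $2}' /proc/$pid/status)
echo "VmSize = $vmsize kB, VmRSS = $vmrss kB, gap = $((vmsize - vmrss)) kB"
```

Even for a bare shell the gap is sizeable; shared libraries, stack, and copy-on-write pages all live in that difference.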
[Pw_forum] Memory usage by pw.x
> What about tracking the maximum and the minimum recorded within an
> entire SCF loop by sampling the memory occupancy where a clock (start or
> stop) is triggered?

I tried this possibility, defining the routines below in clocks.f90 and
the variable max_ram_kb (stored for laziness in "mytime"). Then I call
set_max_tracked_ram at every start/stop of a clock. I would expect this
to work, since for example h_psi is called within the diagonalization
cycles, where workspaces and (I expect) most arrays have already been
allocated.

Yet in my test, the max memory printed by top is 2.3GB within the first
step of the SCF cycle, but the standard call to memstat in electrons.f90
returned 744.1 Mb and the one tracked as described above 1168.572 Mb
(maximum reached earlier than that 744.1 Mb).

I report here the subroutines I used; they are trivial, but...

SUBROUTINE set_max_tracked_ram ()
  USE mytime, ONLY : max_ram_kb
  IMPLICIT NONE
  INTEGER :: kilobytes
  CALL memstat ( kilobytes )
  IF ( kilobytes > max_ram_kb+50 ) THEN
     max_ram_kb = kilobytes
     CALL write_max_tracked_ram ()
  END IF
END SUBROUTINE set_max_tracked_ram
!
SUBROUTINE write_max_tracked_ram ()
  USE io_global, ONLY : stdout
  USE mytime,    ONLY : max_ram_kb
  IMPLICIT NONE
  WRITE( stdout, 9001 ) max_ram_kb/1000.0
9001 FORMAT(/' XXX per-process dynamical memory: ',f7.1,' Mb' )
END SUBROUTINE write_max_tracked_ram

Guido

--
Guido Fratesi

Dipartimento di Scienza dei Materiali
Universita` degli Studi di Milano-Bicocca
via Cozzi 53, 20125 Milano, Italy
[Pw_forum] Memory usage by pw.x
guido,

On Thu, Aug 30, 2012 at 12:04 PM, Guido Fratesi wrote:
> I'm sorry for a previous incomplete message.
>
>> sed -ri "s/(^ *)(allocate.*$)/\1\2\n\1 CALL mem_whatever()/i" $(find
>> /where/is/espresso -name \*.f90)
>
> That was very useful, thank you.
>
> I now get 1.9GB out of 2.4, which starts giving some usable estimate,
> but I do understand that getting the accurate value is very complex.

quantifying memory usage precisely on unix/linux machines with virtual
memory management and memory sharing is almost impossible. you have
multiple components to worry about:

- address space (memory reserved to be used, but initially all mapped to
  the same copy-on-write location)
- resident set size (actual physical memory used)
- shared memory (it is not memory that is shared, but more a measure for
  how much sharing is going on)
- swap space
- device memory (from infiniband cards, for example)
- pinned memory (allocated memory that cannot be swapped, usually used
  to back device memory)

so determining the real memory usage is difficult. address space (VMEM)
is usually too large; resident set size (RSS) does not consider memory
that is swapped out, so it is often too small. using many MPI tasks
drives up the address space for device memory (which doesn't increase
real memory usage, but also requires more pinned memory, which makes
swapping more likely). multi-threading results in a lot of sharing.
...and tracking allocations in the code only handles explicit
allocations, not those incurred by the fortran language on the stack or
otherwise.

so for all practical purposes you can say that memory use is usually
somewhere between VMEM and RSS, but those can be pretty far apart.

axel.

> To my purpose, I'll monitor the memory occupancy by a script, which
> follows below in case someone finds it useful...
>
> Guido
>
> command=pw.x
>
> maxsecs=$((60*60*24))
> delay=1
> nsteps=$((maxsecs/delay))
>
> echo "#when RAM PID (0 for all $command instances)"
>
> for ((i=0; i<nsteps; i++)); do
>    timer=`date +"%s.%N"`
>    ps -eo comm,rss,pid |
>       awk -v comm=$command -v timer=$timer '
>          BEGIN {tot=0}
>          ($1==comm) {
>             tot+=$2;
>             print timer, $2, $3;
>          }
>          END {if (tot) print timer, tot, 0}
>       '
>    sleep 1
> done
>
> --
> Guido Fratesi
>
> Dipartimento di Scienza dei Materiali
> Universita` degli Studi di Milano-Bicocca
> via Cozzi 53, 20125 Milano, Italy

--
Dr. Axel Kohlmeyer
akohlmey at gmail.com  http://goo.gl/1wk0
International Centre for Theoretical Physics, Trieste. Italy.
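A one-shot flavour of the sampling done in the quoted script, showing the VMEM/RSS pair Axel refers to (my sketch, not from the thread; standard ps column names, with the current shell standing in for pw.x):

```shell
# ps reports vsz (address space) and rss (resident set), both in kB.
# Per the discussion above, real usage lies somewhere between the two.
read -r vsz rss <<EOF
$(ps -o vsz=,rss= -p $$)
EOF
echo "VSZ = $vsz kB, RSS = $rss kB"
```

Sampling these two columns in a loop, as the script does for rss, gives an upper and lower envelope of the run's memory use over time.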
[Pw_forum] Memory usage by pw.x
On Aug 29, 2012, at 3:24 PM, Guido Fratesi wrote:
> Let me guess that
> calling again memstat later on, eg in c_bands, could provide one a more
> precise estimate, but maybe you can suggest a better approach, or
> correct me if I'm completely wrong.

I think you need to put several calls in several parts of the code in
order to understand how much memory is going to be allocated. It will
likely happen that the memory occupancy increases, then decreases, then
increases again, and so on. It is possible to incorporate memory
monitoring within the clock module, BUT the amount of output that can be
printed is huge!

What about tracking the maximum and the minimum recorded within an
entire SCF loop by sampling the memory occupancy where a clock (start or
stop) is triggered?

Cheers,
Filippo

--
Mr. Filippo SPIGA, M.Sc., Ph.D. Candidate
CADMOS - Chair of Numerical Algorithms and HPC (ANCHP)
École Polytechnique Fédérale de Lausanne (EPFL)
http://anchp.epfl.ch ~ http://filippospiga.me ~ skype: filippo.spiga

"Nobody will drive us out of Cantor's paradise." ~ David Hilbert
[Pw_forum] Memory usage by pw.x
Hi Guido,

> I'm trying to quantify the memory usage by pw.x

Good luck. Understanding how much memory a code really uses is highly
nontrivial, due to the way modern operating systems work (shared
libraries, files kept in RAM, ...).

> at the beginning of the SCF cycle the message "per-process
> dynamical memory:" reports the memory allocated at that time
> (clib/memstat.c), but this value is significantly less than the one
> I can see by monitoring the process by the "top" command, as
> more memory is allocated afterwards. Let me guess that calling
> again memstat later on, eg in c_bands, could provide one a more
> precise estimate, but maybe you can suggest a better approach

I cannot.

P.

---
Paolo Giannozzi, Dept of Chemistry&Physics&Environment,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
[Pw_forum] Memory usage by pw.x
Dear all,

I'm trying to quantify the memory usage by pw.x: at the beginning of the
SCF cycle the message "per-process dynamical memory:" reports the memory
allocated at that time (clib/memstat.c), but this value is significantly
less than the one I can see by monitoring the process with the "top"
command, as more memory is allocated afterwards. Let me guess that
calling memstat again later on, e.g. in c_bands, could provide a more
precise estimate, but maybe you can suggest a better approach, or
correct me if I'm completely wrong.

Thank you in advance,
Guido

--
Guido Fratesi

Dipartimento di Scienza dei Materiali
Universita` degli Studi di Milano-Bicocca
via Cozzi 53, 20125 Milano, Italy
[Pw_forum] Memory usage in Quantum Espresso
On Mar 10, 2011, at 22:29, Krukau, Aliaksandr wrote:
> The user guide mentions that old compilers can reduce the size
> of the system that PWSCF can treat. But I use relatively new
> version 9.0 of Portland group Fortran compilers.

Either your compiler is buggy, or it requires some specific options to
run a relatively large code like PWscf. If you paid real money for the
PGI compiler, complain to the vendor. In the meantime, try a recent
version of gfortran or a non-buggy version of the Intel compiler (the
latest v.11 should be ok).

P.

---
Paolo Giannozzi, Dept of Chemistry&Physics&Environment,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
[Pw_forum] Memory usage in Quantum Espresso
Dear QE users,

My desktop has an Intel Core 2 Quad processor and 4GB RAM. However, when
I run Quantum ESPRESSO (QE) 4.1, the biggest PWSCF jobs that I manage to
run use about 200MB of RAM. For bigger systems or tighter cutoffs, the
calculations crash with a segmentation violation. Moreover, if I use QE
4.2.1, the calculations crash even when I use more than ~100MB. So I am
restricted to quite small systems, especially with the latest 4.2.1
version.

The user guide mentions that old compilers can reduce the size of the
system that PWSCF can treat. But I use the relatively new version 9.0 of
the Portland Group Fortran compilers. If I run the 'limit' command, it
shows 'memoryuse unlimited; stacksize unlimited'.

I apologize for the naive questions, but why can PWSCF use only a small
fraction of my total RAM? Or does the line "per-process dynamical
memory" in the output file not show the total required memory? Is there
a way to run bigger calculations without using parallel execution
(maybe via compiler flags etc.)?

Best regards,
Alex Krukau, Indiana University
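One check worth making in such cases (my suggestion, not from the replies): some compilers place automatic arrays on the stack, so a low per-process stack limit can cause segmentation faults long before physical RAM runs out. The 'limit' output above is from csh; it is worth verifying the limits in the shell that actually launches pw.x, e.g. in bash:

```shell
# Show per-process limits that commonly cause segfaults in large
# Fortran runs even when plenty of RAM is free. "unlimited" or a
# large number is what you want to see.
stack=$(ulimit -s 2>/dev/null || echo "unknown")
vmem=$(ulimit -v 2>/dev/null || echo "unknown")
echo "stack size limit:  $stack"
echo "virtual mem limit: $vmem"
# To raise the stack limit for this shell and its children (if allowed):
# ulimit -s unlimited
```

Note that limits are per shell session, so a raised limit in an interactive shell does not automatically apply to jobs started elsewhere.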
[Pw_forum] memory usage
On Feb 16, 2007, at 12:27, Marcel Mohr wrote:
> how can i print or estimate the memory a calculation would need?

Estimate: there is a subsection "Memory requirements" in the user's
guide. It is not very detailed, but it gives you an idea.

Print: there is a routine "memstat" that can be called after the
initialization phase, once the largest arrays have been allocated (you
could try to call it after the line
   WRITE( stdout, 9000 ) get_clock( 'PWSCF' )
in electrons.f90). It calculates (on linux, aix, and a few other OSes)
the size of the dynamically allocated memory only. It is not the maximum
memory size, because a non-negligible amount of memory is allocated
later, during self-consistency. Keeping track of how much memory is
really used is a nontrivial task.

Paolo

---
Paolo Giannozzi, Democritos and University of Udine, Italy
[Pw_forum] memory usage
Dear list members,

I have the suspicion that all of you have supercomputers where memory
usage is incidental. OK, not really, but how can I print or estimate the
memory a calculation would need? (In older versions there was memory.x,
and later the routine show_memory.)

Kind regards,
Marcel