Re: [slurm-users] [EXT] Re: [External] maxRSS and aveRSS

Prentice Bisbal Fri, 12 Mar 2021 16:39:30 -0800


On 3/12/21 6:37 PM, Sean Crosby wrote:

On Sat, 13 Mar 2021 at 08:48, Prentice Bisbal <pbis...@pppl.gov> wrote:



    It sounds like you're confusing job steps and tasks. For an MPI
    program, tasks and MPI ranks are the same thing. A Slurm job can
    have multiple steps. A single job step could have only 1 task,
    while another step in the same job can use 1,000 tasks. When
    looking at the amount of memory for a job, the important number is
    the largest value of MaxRSS across all the job steps. Why is this
    important? Because if you don't request at least this much with
    your --mem specification, your job may fail.
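
Picking out the step with the largest MaxRSS can be sketched in a few lines of Python. This is only a rough sketch: it assumes pipe-delimited output as produced by `sacct -P --format JobID,MaxRSS`, it assumes the K/M/G suffixes are 1024-based, and the job ID, step names, and values in SAMPLE are hypothetical, made up purely for illustration.

```python
# Multipliers for sacct-style size suffixes (assumed to be 1024-based).
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(value):
    """Convert a string such as '962245K' or '927811665' to bytes."""
    value = value.strip()
    suffix = value[-1].upper() if value[-1].isalpha() else ""
    number = value[:-1] if suffix else value
    return int(float(number) * UNITS[suffix])


def peak_step_maxrss(sacct_output):
    """Return (step ID, MaxRSS in bytes) for the step with the largest MaxRSS."""
    lines = [line for line in sacct_output.splitlines() if line.strip()]
    header = lines[0].strip().split("|")
    id_col = header.index("JobID")
    rss_col = header.index("MaxRSS")
    best = None
    for line in lines[1:]:
        fields = line.strip().split("|")
        rss = fields[rss_col]
        if not rss:  # the job allocation line carries no MaxRSS of its own
            continue
        candidate = (fields[id_col], to_bytes(rss))
        if best is None or candidate[1] > best[1]:
            best = candidate
    return best


# Hypothetical sacct -P output, for illustration only.
SAMPLE = """
JobID|MaxRSS|
1234||
1234.batch|5120K|
1234.0|962245K|
1234.1|300M|
"""

print(peak_step_maxrss(SAMPLE))
```

The function only reports which step had the highest MaxRSS; how that number relates to a --mem request depends on what MaxRSS actually measures.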

    Based on your definition of AveRSS (I didn't go back and check
    the documentation myself), it sounds like you're doing unnecessary
    math, since I'm sure Slurm sums up the individual max RSS
    values for each task to get MaxRSS, and then divides that by the
    number of tasks to get the AveRSS.

This is incorrect. MaxRSS is the amount of RAM used by the single task that used the most RAM. That is why there are also MaxRSSNode and MaxRSSTask values. MaxRSSNode is the node that task ran on, and MaxRSSTask is the task ID of that task.

Thanks for the correction. That's what I originally thought, and then I read the definition he provided, which is exactly the same as in the documentation, and completely misinterpreted it. When I look at the sacct documentation and see that same definition in the context of all the other MaxRSS values, it's clear I screwed up. Sorry!

SchedMD should reword that so even out of context it's clear what it represents.

When I read "Maximum resident set size of all tasks in job" I automatically thought "Maximum of the *sum* of the RSSes of each task."


Prentice

If you are trying to work out the RAM that the job as a whole used, use TRESUsageInTot.


For a job on our cluster:

# sacct -j 24207294 -oJobID,Node,AveRSS,MaxRSS,MaxRSSTask,MaxRSSNode,TRESUsageInTot -p

JobID|NodeList|AveRSS|MaxRSS|MaxRSSTask|MaxRSSNode|TRESUsageInTot|
24207294.0|spartan-bm[055-056,058-059,061-062,085,091-093,096,098-099,104,108,112-117,120-124]|927811665|962245K|3|spartan-bm058|cpu=4784-18:38:23,energy=0,fs/disk=3555263283,mem=217455859K,pages=2438,vmem=434981656K|

This shows that AveRSS was 884MB, MaxRSS came from task 3 running on spartan-bm058, which used 939MB, and all tasks in total used 212359MB.
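
The conversions behind those figures can be checked with a short Python sketch. It assumes that a value without a suffix is in bytes, that the K suffix means 1024 bytes, and that "MB" here means 1024*1024 bytes; the helper names are made up for this example.

```python
# Multipliers for sacct-style size suffixes (assumed to be 1024-based).
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(value):
    """Convert a string such as '962245K' or '927811665' to bytes."""
    value = value.strip()
    suffix = value[-1].upper() if value[-1].isalpha() else ""
    number = value[:-1] if suffix else value
    return int(float(number) * UNITS[suffix])


def to_mib(value):
    """Convert a sacct-style size string to units of 1024*1024 bytes."""
    return to_bytes(value) / 1024**2


def tres_field(tres, key):
    """Pull one value (for example 'mem') out of a TRESUsageInTot string."""
    for item in tres.split(","):
        name, _, val = item.partition("=")
        if name == key:
            return val
    raise KeyError(key)


# Values copied from the sacct output for job step 24207294.0.
ave_rss = "927811665"
max_rss = "962245K"
tres = ("cpu=4784-18:38:23,energy=0,fs/disk=3555263283,"
        "mem=217455859K,pages=2438,vmem=434981656K")
total_mem = tres_field(tres, "mem")

print(int(to_mib(ave_rss)))    # 884
print(int(to_mib(max_rss)))    # 939
print(int(to_mib(total_mem)))  # 212359

# Ratio of total memory to AveRSS.
print(round(to_bytes(total_mem) / to_bytes(ave_rss)))  # 240
```

The last ratio comes out at about 240. That would be the task count if AveRSS is the total divided by the number of tasks, but NTasks was not part of the output shown, so treat it as an inference rather than something sacct reported.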

Also remember that --mem is a per-node memory request. It is not a per-job or a per-task memory request.


Sean

    On 3/9/21 3:41 AM, xiaojingh...@163.com wrote:

    Hi guys,
    I would like to calculate the CPU efficiency and memory efficiency
    of Slurm jobs.

    I am having difficulty calculating the real "memory" a job uses.
    According to Slurm, "maxRSS" means "Maximum resident set size of
    all tasks in job". If so, how can I get the memory used by a single
    job? As far as I can tell, if I need to know the memory used by a
    single job/job step, I need to sum up the memory used by each task.
    So I think I should use the "aveRSS" field, which gives the
    "average resident set size of all tasks in job". If I multiply
    "aveRSS" by the number of tasks, I should get the real memory a
    job/job step used.

    But I studied the code of the "seff" command, and it claims to be
    equivalent to
    "sacct -P -n -a --format JobID,User,Group,State,Cluster,AllocCPUS,REQMEM,TotalCPU,Elapsed,MaxRSS,ExitCode,NNodes,NTasks -j <job_id>",
    which means I should use "maxRSS".

    Can anyone give me some explanation on that?

    Very grateful for any help.
    Thank you!

    Regards,
    Xiaojing
