Slurm-Dev,

Is there anything in the works to extend TotalCPU so that it also tracks the 
user and system time of child processes?  I see that TotalCPU is currently 
documented as: "provides a measure of the task's parent process and does not 
include CPU time of child processes."  I ask because it would be nice to 
profile how well multi-core jobs are using the system, a sort of parallel 
efficiency measure.  One could compare (wall time * CPUs) to a total that 
includes child processes (call it FullCPUTotal) and see whether users were 
"hogging" cores.  Back in my LSF days, there was a Hog Factor that was 
something like this.  Right now the only way I see to catch this is while the 
job is running on the cluster, not after it completes.
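
To make the comparison concrete, here is a rough sketch of the ratio I have 
in mind.  It assumes the inputs are strings as printed by sacct for the 
TotalCPU, Elapsed, and AllocCPUS fields, in the [DD-][HH:]MM:SS[.fff] form. 
The function names are made up for illustration and are not part of Slurm. 
Since TotalCPU does not include child process time per the definition quoted 
above, the ratio would read too low for jobs that fork their workers.

```python
def parse_slurm_time(text):
    """Convert a time string like [DD-][HH:]MM:SS[.fff] to seconds.

    Assumption: this covers the forms sacct prints for TotalCPU and Elapsed.
    """
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in text.split(":")]
    # Pad missing leading fields (e.g. "MM:SS.fff") with zeros.
    while len(parts) < 3:
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(total_cpu, elapsed, alloc_cpus):
    """Return CPU time used divided by (wall time * allocated CPUs)."""
    available = parse_slurm_time(elapsed) * alloc_cpus
    if available == 0:
        return 0.0
    return parse_slurm_time(total_cpu) / available


# Hypothetical job: 4 cores held for 1 hour, 2 hours of CPU time recorded.
print(cpu_efficiency("02:00:00", "01:00:00", 4))
```

A value near 1.0 would mean the allocated cores were kept busy; a value well 
below 1.0 would flag a job that held cores without using them.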
 

Cheers,

~Scott
==================================
Dr. Scott Yockel | Senior Team Lead of HPC
FAS Research Computing | Harvard University
38 Oxford Street Cambridge, MA
Office: 211A | Phone: 617-496-7468
==================================
