Slurm-Dev,

Is there anything in the works to extend TotalCPU so that it also tracks the user and system time of child processes? I see that TotalCPU is currently defined as: "provides a measure of the task's parent process and does not include CPU time of child processes."

I ask because it would be nice to profile how well multi-core jobs are using the system, as a sort of parallel efficiency measure. One could compare (wall time * cpus) to (FullCPUTotal) and see whether users were "hogging" cores. Back in my LSF days, they had a Hog Factor that was something like this. Right now the only way I see to catch this is while the job is running on the cluster, not after the job completes.
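A minimal sketch of that comparison, assuming the accounting values arrive as sacct-style duration strings such as "1-02:03:04" or "01:30.500". The function names parse_slurm_time and cpu_efficiency are hypothetical and are not part of Slurm. Because TotalCPU currently excludes child processes, the ratio would understate usage for jobs that fork their workers.

```python
def parse_slurm_time(text: str) -> float:
    """Convert a [DD-][HH:]MM:SS[.fff] duration string to seconds.

    Assumption: the input follows the sacct-style duration layout.
    """
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in text.split(":")]
    # Pad missing leading fields (for example "MM:SS") with zeros.
    while len(parts) < 3:
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(total_cpu: str, elapsed: str, alloc_cpus: int) -> float:
    """Return consumed CPU time divided by (wall time * allocated CPUs)."""
    core_seconds = parse_slurm_time(elapsed) * alloc_cpus
    if core_seconds == 0:
        return 0.0  # avoid dividing by zero for jobs with no recorded runtime
    return parse_slurm_time(total_cpu) / core_seconds
```

For example, a job that held 8 cores for one hour but accumulated only two hours of CPU time would come out at 0.25, meaning three quarters of the allocated core time sat idle.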

Cheers,
~Scott

==================================
Dr. Scott Yockel | Senior Team Lead of HPC
FAS Research Computing | Harvard University
38 Oxford Street
Cambridge, MA
Office: 211A | Phone: 617-496-7468
==================================