All,

I'm collecting some usage metrics for our cluster, and I'd like to look at 
utilisation in terms of allocated CPU % by partition, basically the equivalent 
of `sinfo -O cpusstate -p partition_name`, but for historic data. What's the 
best way to do this? 

I've found that running `sacct --allusers --state=RUNNING` and summing 
allocated CPUs seems to give the same results as `sinfo`, but when feeding in 
explicit starttime/endtime parameters it's not so clear. My naive approach 
of simply adding up allocated cores over, say, a 5-minute window seems to give 
a lower value than `sinfo`. Does this command syntax capture only jobs that 
were running for the entire window, or any job running at some point within 
it? Can I query an instantaneous time rather than a window? Am I missing 
something else?
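
A minimal sketch of the instantaneous sum described above, in case it helps 
pin down the question. It assumes the job records have already been exported 
as pipe-delimited lines with the fields JobID, Partition, AllocCPUS, Start and 
End, in that order, with one line per job allocation (no job steps). The 
function name `allocated_cpus_at` and the sample records are made up for 
illustration, and the sketch does not settle which jobs `sacct` returns for a 
given window.

```python
from collections import defaultdict
from datetime import datetime


def parse_time(value):
    """Parse an ISO-style timestamp; return None for 'Unknown', 'None', etc."""
    try:
        return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None


def allocated_cpus_at(sacct_lines, instant):
    """Sum AllocCPUS per partition for jobs running at `instant`.

    Assumed line layout (hypothetical export, pipe-delimited):
    JobID|Partition|AllocCPUS|Start|End
    """
    totals = defaultdict(int)
    for line in sacct_lines:
        line = line.strip()
        if not line:
            continue
        _jobid, partition, alloc_cpus, start, end = line.split("|")[:5]
        started = parse_time(start)
        ended = parse_time(end)
        # Skip jobs that had not started yet (or never started).
        if started is None or started > instant:
            continue
        # Skip jobs that had already finished; an unparseable End is
        # treated as "still running".
        if ended is not None and ended <= instant:
            continue
        totals[partition] += int(alloc_cpus)
    return dict(totals)


# Made-up sample records, for illustration only.
sample = [
    "101|compute|16|2024-01-01T09:00:00|2024-01-01T11:00:00",
    "102|compute|4|2024-01-01T10:02:00|2024-01-01T10:03:00",
    "103|gpu|8|2024-01-01T09:30:00|Unknown",
    "104|compute|32|Unknown|Unknown",
]
print(allocated_cpus_at(sample, datetime(2024, 1, 1, 10, 5)))
```

Input of this shape could presumably be produced with something like 
`sacct --allusers -X --parsable2 --noheader 
--format=JobID,Partition,AllocCPUS,Start,End` plus a starttime/endtime pair 
bracketing the instant, but whether that query returns every job overlapping 
the window is exactly the open question here.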

I've played with the `sreport` command as well, but that doesn't seem to allow 
specifying a particular partition to analyse. Once I've got the general 
pattern down, I'd like to analyse by other job characteristics too (e.g. 
single/multicore).

Appreciate any guidance!

Cheers,

David
