Dear All,

We have an HPC cluster running the Slurm job scheduler (17.02.8). There are
several private partitions (each sponsored by a particular group) and a
"common" partition. The private partitions are used exclusively by their
sponsoring users, while all users (including private users) have equal access
to the "common" partition. We currently use the Multi-factor Job Priority
plugin to determine job priority. Everything works well, except that private
users are disadvantaged in the fair-share factor relative to non-private users
when using the "common" partition. Private users may run many jobs on their
own private partitions and only occasionally run jobs on the "common"
partition. When they do submit jobs to the "common" partition, they normally
get a much lower fair-share factor than non-private users. It appears that
private users are penalized because they have already consumed a large amount
of resources on the cluster (even though those resources are privately owned).
In the Fairshare Algorithm
(https://slurm.schedmd.com/classic_fair_share.html), it appears that Uuser and
Utotal are calculated from the processor*seconds consumed across the whole
cluster. Can we exclude some partitions when calculating consumed resources?
Is such functionality available in Slurm?
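
To illustrate the effect, here is a minimal sketch in Python. It assumes a
simplified form of the fair-share formula, F = 2^(-U/S), where U is the user's
share of total usage and S is the user's normalized share. It ignores usage
decay, any dampening factor, and the account hierarchy, and the function name
and all usage numbers are made up for illustration; this is not Slurm code.

```python
def fair_share_factor(user_usage, total_usage, norm_shares):
    """Simplified fair-share factor F = 2 ** (-U / S) (illustrative only)."""
    if total_usage == 0:
        return 1.0  # nobody has used anything yet, so no penalty
    effective_usage = user_usage / total_usage  # U = Uuser / Utotal
    return 2.0 ** (-effective_usage / norm_shares)


# Hypothetical processor*seconds for two users with equal shares (0.5 each).
private_user = {"private": 900, "common": 50}
other_user = {"private": 0, "common": 50}

# Usage counted over the whole cluster (what we seem to observe).
total_all = sum(private_user.values()) + sum(other_user.values())
f_private_all = fair_share_factor(sum(private_user.values()), total_all, 0.5)
f_other_all = fair_share_factor(sum(other_user.values()), total_all, 0.5)

# Usage counted on the "common" partition only (what we would like).
total_common = private_user["common"] + other_user["common"]
f_private_common = fair_share_factor(private_user["common"], total_common, 0.5)
f_other_common = fair_share_factor(other_user["common"], total_common, 0.5)

print(f_private_all, f_other_all)        # roughly 0.27 versus 0.93
print(f_private_common, f_other_common)  # 0.5 for both users
```

With these made-up numbers, the private user's factor on the "common"
partition is far lower than the other user's, even though both have used the
"common" partition equally.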

I have tried setting TRESBillingWeights="CPU=0.0,Mem=0.0,GRES/gpu=0.0" on the
private partitions. It seems that this setting only affects the TRES factor,
not the fair-share factor.

Any suggestions?

Thanks
Manhui
