[slurm-users] Areas for improvement on our site's cluster scheduling

Jonathon A Anderson Mon, 07 May 2018 21:43:52 -0700

We have two main issues with our scheduling policy right now. The first is an 
issue that we call "queue stuffing." The second is an issue with interactive 
job availability. We aren't confused about why these issues exist, but we 
aren't sure the best way to address them.


I'd love to hear any suggestions on how other sites address these issues. 
Thanks for any advice!


## Queue stuffing

We use multifactor scheduling to provide account-based fairshare scheduling as 
well as standard fifo-style job aging. In general, this works pretty well, and 
accounts meet their scheduling targets; however, every now and again, we have a 
user who has a relatively high-throughput (not HPC) workload that they're 
willing to wait a significant period of time for. They're low-priority work, 
but they put a few thousand jobs into the queue, and just sit and wait. 
Eventually the job aging makes the jobs so high-priority, compared to the 
fairshare, that they all _as a set_ become higher-priority than the rest of the 
work on the cluster. Since they continue to age as the other jobs continue to 
age, these jobs end up monopolizing the cluster for days at a time, as their 
high volume of relatively small jobs use up a greater and greater percentage of 
the machine.

In Moab I'd address this by limiting the number of jobs the user could have 
*eligible* at any given time; but it appears that the only option for slurm is 
limiting the number of jobs a user can *submit*, which isn't as nice a user 
experience and can lead to some pathological user behaviors (like users running 
cron jobs that wake repeatedly and submit more jobs automatically).


## Interactive job availability

I'm becoming increasingly convinced that holding some portion of our resource 
aside as dedicated for relatively short, small, interactive jobs is a unique 
good; but I'm not sure how best to implement it. My immediate thought was to 
use a reservation with the DAILY and REPLACE flags. I particularly like the 
idea of using the REPLACE flag here as we could keep a flexible amount of 
resources available irrespective of how much was actually being used for the 
purpose at any given time; but it doesn't appear that there's any way to limit 
the per-user use of resources *within* a reservation; so if we created such a 
reservation and granted all users access to it, any individual user would be 
capable of consuming all resources in the reservation anyway. I'd have a 
dedicated "interactive" qos or similar to put such restrictions on; but there 
doesn't appear to be a way to then limit the use of the reservation to only 
jobs with that qos. (Aside from job_submit scripts or similar. Please correct 
me if I'm wrong.)

In lieu of that, I'm leaning towards having a dedicated interactive partition 
that we'd manually move some resources to; but that's a bit less flexible.

[slurm-users] Areas for improvement on our site's cluster scheduling

Reply via email to