Without preemption, there is no way in your scenario to guarantee that
"high priority" jobs won't occupy every node: it is (theoretically)
possible for only high priority jobs to be submitted until the cluster
is full, before the first low priority job ever arrives. You would need
to decide how to handle that situation when it happens.
To work out how to deal with it, separate what your users consider "high
priority" from what Slurm considers "priority" (which is an actual
number used to order jobs for scheduling).
If you manipulate the Slurm priority number of a job, you can place it
wherever you like in the queue of jobs waiting to run. So a job that is
"low priority" to you can have a priority number that is quite high to
Slurm, so that it runs next.
In this manner, you can adjust any of the jobs in the queue to place
them in the order you want.
Using that approach, you may want to just look at the multifactor
priority plugin (https://slurm.schedmd.com/priority_multifactor.html),
in particular the Age and Partition factors. Maybe make "low priority"
jobs gain priority faster than high priority jobs. That way they may
wait, but they will wait a relatively shorter amount of time. There are
numerous other factors you can use. If you have accounting and
associations configured, you can tune priority all the way down to the
association and QOS level.
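To make the Age/Partition idea concrete, here is a rough sketch of how
those two terms of the multifactor formula combine. It is a toy model,
not Slurm code: it keeps only the age and partition terms and ignores
fairshare, job size, QOS and the rest. The weights, the 7-day maximum
age and the partition factors 1.0 and 0.5 are made-up values for
illustration, and it assumes a single age weight shared by all jobs, so
"low priority" jobs catch up because the age weight is larger than the
partition gap rather than because they age at a different rate.

```python
DAY = 24 * 3600  # seconds


def age_factor(wait_seconds, max_age_seconds):
    """Age factor grows linearly from 0.0 and saturates at 1.0."""
    return min(wait_seconds / max_age_seconds, 1.0)


def job_priority(wait_seconds, partition_factor,
                 weight_age=2000, weight_partition=1000,
                 max_age_seconds=7 * DAY):
    """Toy priority: weighted age term plus weighted partition term.

    The default weights and the maximum age are illustrative
    assumptions, not recommended settings.
    """
    age_term = weight_age * age_factor(wait_seconds, max_age_seconds)
    partition_term = weight_partition * partition_factor
    return int(age_term + partition_term)


# A "high priority" job that was just submitted (partition factor 1.0).
fresh_high = job_priority(0, partition_factor=1.0)

# A "low priority" job (partition factor 0.5) after waiting 1 and 2 days.
low_after_1_day = job_priority(1 * DAY, partition_factor=0.5)
low_after_2_days = job_priority(2 * DAY, partition_factor=0.5)

print(fresh_high, low_after_1_day, low_after_2_days)
```

With these numbers the waiting low priority job is still behind a newly
submitted high priority job after one day, but ahead of it after two.
How long that takes is set by the ratio of the two weights and the
maximum age, so that is where you would tune it.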
Brian Andrus
On 8/17/2020 11:23 PM, Gerhard Strangar wrote:
Brian Andrus wrote:
Most likely, but the specific approach depends on how you define what
you want.
My idea was "high prio job is next unless there are too many of them".
For example, what if there are no jobs in high pri queue but many in
low? Should all the low ones run?
Yes.
What should happen if they get started
and use all the nodes and a high-pri request comes in (preemption policy)?
No preemption.
What about the inverse of that?
The inverse of what? All nodes being used by high prio jobs? That's
exactly what I want to avoid.
What if you get a steady stream of
high-pri jobs? How long should low-pri wait before being allowed to run?
As long as it takes. Since I'm trying to avoid high prio jobs consuming
all nodes, it won't take forever. :-)
Does it matter if it is all the same user?
No.
You can handle much of that type of interaction with job priorities and
a single queue. As you can see, the devil is in the details on how to
define/get what you want.
How do you make sure the single partition doesn't run high prio jobs
only if there's a sufficient amount of those?
Gerhard