Re: [slurm-users] Limit nodes of a partition without managing users

Brian Andrus Tue, 18 Aug 2020 08:53:54 -0700

Without preemption, there is no way in your scenario to ensure all nodes won't be used by "high priority" jobs, merely because it is (theoretically) possible for all jobs to be high priority until the cluster is full before the first low priority job is submitted. You would need to decide how to handle the situation when that happens.

To get at how to deal with it, separate what your users consider "high priority" from what slurm considers "priority" (which is an actual number used to schedule jobs).

If you manipulate the slurm priority number of a job, you can place it wherever you like in the queue waiting to run. So a "low priority" job to you can have a priority number that is quite high to slurm, so it runs next.

In this manner, you can do things to any of the jobs in the queue to place them in the order you want.

Using that approach, you may want to just look at the multifactor priority plugin (https://slurm.schedmd.com/priority_multifactor.html), in particular the Age and Partition factors. Maybe make "low priority" jobs gain priority faster than high priority jobs. So, they may wait, but they will wait a relatively shorter amount of time. There are numerous other factors you can use. If you have accounting and associations configured, you can manipulate it all the way to the association and QOS.
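
To make the Age/Partition interplay concrete, here is a rough sketch of the arithmetic. It assumes a simplified model in which the job priority is just the weighted sum of an age factor and a partition factor, each between 0.0 and 1.0 (the real plugin adds further terms such as fairshare, job size and QOS). The parameter names mirror PriorityWeightAge, PriorityWeightPartition and PriorityMaxAge from slurm.conf, but every numeric value below is a made-up illustration, not a recommendation.

```python
# Simplified model of two multifactor terms only (an assumption for illustration):
#   priority = weight_age * age_factor + weight_partition * partition_factor

DAY = 24 * 3600

# Hypothetical values, chosen only to show the effect.
WEIGHT_AGE = 10000        # stands in for PriorityWeightAge
WEIGHT_PARTITION = 5000   # stands in for PriorityWeightPartition
MAX_AGE = 7 * DAY         # stands in for PriorityMaxAge


def age_factor(wait_seconds, max_age_seconds=MAX_AGE):
    """Fraction of the maximum age reached so far, capped at 1.0."""
    return min(wait_seconds / max_age_seconds, 1.0)


def job_priority(wait_seconds, partition_factor):
    """Weighted sum of the age and partition factors, truncated to an integer."""
    return int(WEIGHT_AGE * age_factor(wait_seconds)
               + WEIGHT_PARTITION * partition_factor)


# A freshly submitted job in a "high" partition (partition factor 1.0)
fresh_high = job_priority(wait_seconds=0, partition_factor=1.0)

# A job in a "low" partition (partition factor 0.2) that has waited five days
old_low = job_priority(wait_seconds=5 * DAY, partition_factor=0.2)

print(fresh_high, old_low)  # 5000 8142
```

With these made-up weights, a low priority job that has waited long enough ends up ahead of a newly submitted high priority job, which is the "they may wait, but not forever" behaviour described above. Raising the age weight relative to the partition weight shortens that wait.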


Brian Andrus


On 8/17/2020 11:23 PM, Gerhard Strangar wrote:

Brian Andrus wrote:

Most likely, but the specific approach depends on how you define what
you want.

My idea was "high prio job is next unless there are too many of them".

For example, what if there are no jobs in high pri queue but many in
low? Should all the low ones run?

Yes.

What should happen if they get started
and use all the nodes and a high-pri request comes in (preemption policy)?

No preemption.

What about the inverse of that?

The inverse of what? All nodes being used by high prio jobs? That's
exactly what I want to avoid.

What if you get a steady stream of
high-pri jobs? How long should low-pri wait before being allowed to run?

As long as it takes. Since I'm trying to avoid high prio jobs consuming
all nodes, it won't take forever. :-)

Does it matter if it is all the same user?

No.

You can handle much of that type of interaction with job priorities and
a single queue. As you can see, the devil is in the details on how to
define/get what you want.

How do you make sure the single partition doesn't run high prio jobs
only if there's a sufficient amount of those?

Gerhard
