One thing we changed years ago was how we think about ownership. While researchers are in fact buying nodes for the cluster, they rarely get any rights to "their" nodes. Instead, they are buying an equivalent amount of CPU time, averaged over 30 days.

We provide reports, explain how fairshare works and how each group's particular value was chosen, and share all the ways that we as sysadmins look at the data. For the most part, researchers have accepted that buy-in means CPU hours per month rather than a certain piece of hardware.

Now that journey was not easy, and there is a discussion with every new researcher who wants a piece of the cluster, but it works for us. There is one big caveat: at least a 10% share of the cluster needs to be publicly owned.

This gives some leeway in scheduling and allows users who have not contributed to access the same resources, albeit at a much lower priority. For us, that 10% has grown to 25% or more, since we have a large and ever-growing base of users who have not contributed.

This different way of thinking has spared us from dedicated partitions and QOSes: CPU time per 30-day sliding window has been accepted, can be shown quantitatively, and is simply a much easier way to schedule when ALL resources can be used.
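The accounting model above can be sketched as a toy calculation. Everything here is an assumption for illustration (the job records, the 20,000 CPU-hour buy-in); in practice the numbers would come from Slurm accounting via sacct/sreport:

```python
from datetime import datetime, timedelta

# Hypothetical job records for one research group: (job end time, CPU-hours).
jobs = [
    (datetime(2019, 10, 1), 5000.0),
    (datetime(2019, 10, 15), 8000.0),
    (datetime(2019, 10, 27), 3000.0),
]

def usage_last_30_days(jobs, now):
    """Sum the CPU-hours consumed inside the 30-day sliding window."""
    cutoff = now - timedelta(days=30)
    return sum(hours for end, hours in jobs if end >= cutoff)

# The group's buy-in, expressed as CPU-hours per 30 days rather than as nodes.
purchased_cpu_hours = 20000.0

now = datetime(2019, 10, 28)
used = usage_last_30_days(jobs, now)
# The group is within its purchased share: 16000.0 of 20000.0 CPU-hours.
print(used, used <= purchased_cpu_hours)  # 16000.0 True
```

The point of the sketch is that "ownership" becomes a single number per group that can be recomputed and reported every day, rather than a set of hostnames.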

Bill

On 10/28/19 11:11 AM, Tina Friedrich wrote:
Hello,

is there a possibility to tie a reservation to a QoS (instead of an
account or user), or enforce a QoS for jobs submitted into a reservation?

The problem I'm trying to solve is - some of our resources are bought on
a co-investment basis. As part of that, the 'owning' group can get very
high scheduling priority (via a QoS) on an equivalent amount of
resource. Additionally, they have a number of reservations for 'their'
nodes they can request per year. However, that lends itself to gaming
the system - they can now submit jobs into the reservation with 'normal'
priority, and then run jobs on the rest of the cluster using the higher
priority - really not the plan.

Basically, I need a way to ensure that - even when a reservation is in
place - those groups 'use up' their priority resources first & then all
other jobs they submit are run with 'lower' priority.

I'm currently dealing with it by modifying the QoS every time a
reservation is created. But that isn't really sustainable on an ongoing
basis - this isn't a one-off for one group, it's part of our operations
model, and there's a (growing) number of them.
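A sketch of that per-reservation workaround, for what it's worth. All names and numbers here are placeholders (coinvest_a, the QoS name, the node count, and the TRES limits are assumptions, not our real config):

```shell
# Create the group's reservation on an equivalent amount of resource.
scontrol create reservation ReservationName=coinvest_a_res \
    Accounts=coinvest_a StartTime=2019-11-04T09:00:00 \
    Duration=7-00:00:00 NodeCnt=16

# While the reservation is active, shrink the group's high-priority QoS
# so the reserved nodes and the priority share can't be used at once.
sacctmgr -i modify qos where name=coinvest_a_qos set GrpTRES=cpu=0

# When the reservation ends, restore the original limit.
sacctmgr -i modify qos where name=coinvest_a_qos set GrpTRES=cpu=512
```

This is exactly the manual dance the paragraph above describes; the pain is that the two sacctmgr steps have to be timed to each reservation's start and end, which is what doesn't scale.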

One (easy) way I can see is if I had a way to ensure you cannot use a
reservation without using the respective priority QoS - however, from my
reading of the docs there's no way to do that. (As only the one account
has access to the QoS, being able to tie a reservation to a QoS would
sort of solve my problem :) ).

Any ideas? The only thing I can come up with involves a lot of scripting,
and it would certainly be more than a bit error prone (and not the most
flexible).

Tina
