Some examples are here:
https://wiki.fysik.dtu.dk/niflheim/Slurm_accounting#quality-of-service-qos

/Ole

On 19-12-2019 19:30, Prentice Bisbal wrote:

On 12/19/19 10:44 AM, Ransom, Geoffrey M. wrote:

The simplest is probably to just have a separate partition that will only allow job times of 1 hour or less.
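
A minimal sketch of what that could look like in slurm.conf, with two partitions overlapping the same nodes (the node list and the long-partition limit below are only placeholders):

    PartitionName=short  Nodes=node[001-100]  MaxTime=01:00:00     Default=NO
    PartitionName=long   Nodes=node[001-100]  MaxTime=14-00:00:00  Default=YES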

This is how our Univa queues used to work, with two queues overlapping the same hardware. Univa shows available "slots" to the users, so we had a lot of confused users complaining about all those free slots (really busy slots in the other queue) while their jobs sat in the queue, and new users confused as to why their jobs were being killed after 4 hours. I was able to move the short/long behavior into job classes, use RQSes, and keep a single queue.

While Slurm isn't showing users unused resources, I am concerned that going back to two queues (partitions) will cause user-interaction and adoption problems.

It all depends on what best suits the specific needs.

Is there a way to have one partition that holds aside a small percentage of resources for jobs with a runtime under 4 hours, i.e., so that jobs with long runtimes cannot tie up 100% of the resources at one time? Some kind of virtual partition that feeds into two other partitions based on runtime would also work. The goal is that users can continue to submit jobs to one partition, but the scheduler won't let 100% of the compute resources get tied up with multi-week jobs.

The way to do this is with Quality of Service (QOS) in Slurm. When creating a QOS, you can cap the total resources that all jobs in that QOS can use at once. Create a QOS for the longer-running jobs and set its GrpTRES so that the number of CPUs is less than 100% of your cluster. Create a QOS for the shorter jobs with a shorter time limit (MaxWall).
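
For example, something along these lines with sacctmgr (the QOS names, the cpu= cap, and the wall-time limits are just placeholders to adapt to your site):

    sacctmgr add qos long
    sacctmgr modify qos long set GrpTRES=cpu=800 MaxWall=14-00:00:00
    sacctmgr add qos short
    sacctmgr modify qos short set MaxWall=04:00:00

Users (or their associations) also need to be allowed to use the QOSes, e.g. sacctmgr modify user someuser set qos+=short,long.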

Once the QOSes are set up, you can instruct your users to specify the proper QOS when submitting a job, or edit the job_submit.lua script to look at the requested time limit and assign/override the QOS based on that.
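
A rough sketch of such a rule in job_submit.lua (the 4-hour cutoff and the QOS names "short"/"long" are assumptions; job_desc.time_limit is in minutes):

    -- route jobs to a QOS based on the requested time limit
    function slurm_job_submit(job_desc, part_list, submit_uid)
       if job_desc.time_limit ~= slurm.NO_VAL and job_desc.time_limit <= 240 then
          job_desc.qos = "short"
       else
          -- unset or long time limits fall through to the capped QOS
          job_desc.qos = "long"
       end
       return slurm.SUCCESS
    end

    function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
       return slurm.SUCCESS
    end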
