Hi Slurm-Dev,

I'm currently designing and testing what will ultimately be a small Slurm cluster of about 60 heterogeneous nodes (five different generations of hardware). Our user base is also diverse, with needs ranging from fast turnaround of small serial jobs to long-running parallel codes (e.g., 16 cores for several months).

In the past we limited users by how many cores they could allocate at any one time. This has the drawback that no distinction is made between, say, 128 cores for 2 hours and 128 cores for 2 months. We want users to be able to run on a large portion of the cluster when it is available, while ensuring that they cannot take advantage of an idle period to start jobs that will monopolize it for weeks.

Limiting by GrpCPURunMins seems like a good answer. I think of it as allocating computational area (i.e., cores * minutes) rather than just width (cores). I'd love to know if anyone has experience with, or thoughts on, imposing limits this way. Also, is anyone aware of a simple way to calculate remaining "area"? I can derive how much of a limit is in use from squeue or sacct by looking at remaining wall-time and core counts, but if there's something built in - or pre-existing - it would be nice to know.
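To put numbers on it: 128 cores for 2 hours is only 15,360 core-minutes, while 128 cores for 60 days is over 11 million, so a single area limit distinguishes the two cases nicely.

For what it's worth, this is roughly how I'm totalling up committed area at the moment - just a quick sketch, and the account name is a placeholder:

  # Sum cores * remaining walltime (core-minutes) over running jobs in one account
  squeue -h -t R -A some_account -o "%C %L" | awk '
    {
      cores = $1; left = $2; days = 0
      if (left ~ /-/) { split(left, d, "-"); days = d[1]; left = d[2] }  # D-HH:MM:SS
      n = split(left, t, ":")
      if (n == 3)      mins = t[1]*60 + t[2] + t[3]/60                   # HH:MM:SS
      else if (n == 2) mins = t[1] + t[2]/60                             # MM:SS
      else             next                                              # UNLIMITED, etc.
      total += cores * (days*1440 + mins)
    }
    END { printf "core-minutes still committed: %.0f\n", total }'

Subtracting that from the GrpCPURunMins value should give the remaining headroom, since (as I understand it) the limit is enforced against exactly that sum of cores times remaining walltime for running jobs.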

It's worth noting that the cluster is divided into several partitions, with most nodes belonging to more than one. This is partly political (to give groups increased priority on nodes they helped pay for) and partly practical (to ensure that users explicitly request the slow nodes rather than having jobs silently land on ancient Opterons). Also, each user gets their own Account, so the QoS Grp limits apply to each human separately. Accounts would also have absolute core limits.
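In case it clarifies what I mean, here's a rough sketch of the sacctmgr setup I'm imagining - the account names and limit values are made up, and I haven't settled on numbers yet:

  # One account per user, carrying both an absolute core cap and a core-minute ("area") cap
  sacctmgr add account jdoe parent=jila
  sacctmgr modify account name=jdoe set GrpCPUs=128 GrpCPURunMins=1000000
  sacctmgr add user jdoe DefaultAccount=jdoe

My understanding is that AccountingStorageEnforce in slurm.conf needs to include at least "limits" (and "qos" if QOS-level limits are used) for any of this to actually be enforced.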

Thank you for your thoughts!

Corey

--
Corey Keasling
Software Manager
JILA Computing Group
University of Colorado-Boulder
440 UCB Room S244
Boulder, CO 80309-0440
303-492-9643
