You could try a QoS with
Flags=DenyOnLimit,OverPartQOS,PartitionTimeLimit Priority=<number> 
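
For example (the QoS name "special" and the priority value here are just
placeholders):

# create the QoS with the flags above; pick a priority that suits you
sacctmgr add qos special \
    Flags=DenyOnLimit,OverPartQOS,PartitionTimeLimit Priority=100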

Depending on how you have your accounting set up, you could tweak the
GrpTRES, MaxTRES, MaxTRESPU and MaxJobsPU limits to try to bring
resource usage down to your 20-node limit.  I'm not sure, off the top
of my head, how to define a hard limit on the maximum number of nodes
a QoS can use.
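
Untested, but "node" is a TRES, so a GrpTRES limit on the QoS might
work as the hard 20-node cap:

# cap the total number of nodes that running jobs in this QoS can hold
sacctmgr modify qos special set GrpTRES=node=20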

You could use accounting to prevent unauthorized users from submitting
to that QoS.
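
For example, only grant the QoS to the users (or accounts) that should
be allowed to use it ("alice" is just an example user):

# add the QoS to an authorized user's QOS list
sacctmgr modify user name=alice set qos+=special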

If the QoS isn't going to be used often, you could allow only one job
running in that QoS at a time, and use job_submit.lua to cap the job's
max_nodes at 20 at submission time:

-- guard against jobs submitted without an explicit QoS (job_desc.qos may be nil)
if job_desc.qos ~= nil and string.match(job_desc.qos, "special") then
  job_desc.max_nodes = 20
end
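
And for the one-job-at-a-time part, a GrpJobs limit on the QoS might do it:

# allow at most one running job in this QoS at any given time
sacctmgr modify qos special set GrpJobs=1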


Just a couple of ideas for you; there's probably a much better way to
do it!

-- 
Nicholas McCollum
HPC Systems Administrator
Alabama Supercomputer Authority

On Wed, 2017-07-19 at 09:12 -0600, Steffen Grunewald wrote:
> Is it possible to define, and use, a subset of all available nodes in a
> partition *without* explicitly setting aside a number of nodes (in a
> static Nodes=... definition)?
> Let's say I want, starting with a 100-node cluster, to make 20 nodes
> available for jobs needing an extended MaxTime and Priority (compared
> with the defaults) - and if these 20 nodes have been allocated, no more
> nodes will be available to jobs submitted to this particular partition,
> but the 20 nodes may cover a subset of all nodes changing over time (as
> it will not be in use very often)?
> 
> Can this be done with Slurm's built-in functionality, 15.08.8 or later?
> Any pointers are welcome...
> 
> Thanks,
>  S
> 
