Re: [slurm-users] schedule mixed nodes first

2021-05-17 Thread Bjørn-Helge Mevik
Durai Arasan  writes:

> Is there a way of improving this situation? For example, by not blocking
> IDLE nodes with jobs that only use a fraction of the 8 GPUs? Why are
> single-GPU jobs not scheduled to fill already MIXED nodes before using
> IDLE ones?
>
> What parameters/configuration need to be adjusted for this to be enforced?

There are two SchedulerParameters you could experiment with (from man
slurm.conf):

   bf_busy_nodes
          When selecting resources for pending jobs to reserve for future
          execution (i.e. the job can not be started immediately), then
          preferentially select nodes that are in use.  This will tend to
          leave currently idle resources available for backfilling longer
          running jobs, but may result in allocations having less than
          optimal network topology.  This option is currently only
          supported by the select/cons_res and select/cons_tres plugins
          (or select/cray_aries with SelectTypeParameters set to
          "OTHER_CONS_RES" or "OTHER_CONS_TRES", which layers the
          select/cray_aries plugin over the select/cons_res or
          select/cons_tres plugin respectively).

   pack_serial_at_end
          If used with the select/cons_res or select/cons_tres plugin,
          then put serial jobs at the end of the available nodes rather
          than using a best fit algorithm.  This may reduce resource
          fragmentation for some workloads.
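
As a minimal sketch, assuming you run the default backfill scheduler and
have no other SchedulerParameters set (otherwise add these to your existing
comma-separated list), it could look like this in slurm.conf:

   # slurm.conf (sketch): prefer nodes already in use when planning future
   # starts, and pack serial jobs at the end of the node list
   SchedulerType=sched/backfill
   SchedulerParameters=bf_busy_nodes,pack_serial_at_end

As far as I know, "scontrol reconfigure" is enough to pick up a changed
SchedulerParameters line, but test that on your own site first.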

-- 
B/H




[slurm-users] schedule mixed nodes first

2021-05-14 Thread Durai Arasan
Hi,

Frequently all of our GPU nodes (8 GPUs each) are in the MIXED state and
there is no IDLE node. Some jobs require a complete node (all 8 GPUs), and
such jobs therefore have to wait a very long time before they can run.

Is there a way of improving this situation? For example, by not blocking
IDLE nodes with jobs that only use a fraction of the 8 GPUs? Why are
single-GPU jobs not scheduled to fill already MIXED nodes before using
IDLE ones?

What parameters/configuration need to be adjusted for this to be enforced?

Our current scheduling configuration:

slurm.conf:
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory

gres.conf (one node example):
NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[0-3] COREs=0-17,36-53
NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[4-7] COREs=18-35,54-71
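
For reference, we check the per-node state and GPU usage with something
like the following (field names may differ slightly between Slurm
versions):

# node-oriented listing: state plus configured and allocated GRES
sinfo -N -O NodeList,StateCompact,Gres:30,GresUsed:30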


Thank you,
Durai
Competence center for Machine Learning Tübingen