Durai Arasan <arasan.du...@gmail.com> writes:

> Is there a way of improving this situation? E.g. by not blocking IDLE nodes
> with jobs that only use a fraction of the 8 GPUs? Why are single GPU jobs
> not scheduled to fill already MIXED nodes before using IDLE ones?
>
> What parameters/configuration need to be adjusted for this to be enforced?

There are two SchedulerParameters you could experiment with (from man 
slurm.conf):

   bf_busy_nodes
          When selecting resources for pending jobs to reserve for future
          execution (i.e. the job can not be started immediately), then
          preferentially select nodes that are in use.  This will tend to
          leave currently idle resources available for backfilling longer
          running jobs, but may result in allocations having less than
          optimal network topology.  This option is currently only supported
          by the select/cons_res and select/cons_tres plugins (or
          select/cray_aries with SelectTypeParameters set to
          "OTHER_CONS_RES" or "OTHER_CONS_TRES", which layers the
          select/cray_aries plugin over the select/cons_res or
          select/cons_tres plugin respectively).

   pack_serial_at_end
          If used with the select/cons_res or select/cons_tres plugin, then
          put serial jobs at the end of the available nodes rather than
          using a best fit algorithm.  This may reduce resource
          fragmentation for some workloads.
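As a sketch (assuming a typical select/cons_tres setup; merge these with
any SchedulerParameters you already have rather than replacing them), the
relevant slurm.conf lines would look like:

   # slurm.conf excerpt -- illustrative only, keep your existing values
   SelectType=select/cons_tres
   SchedulerParameters=bf_busy_nodes,pack_serial_at_end

After editing, "scontrol reconfigure" should pick up the change without a
restart.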

-- 
B/H
