Problem might be that OverSubscribe is not enabled? Without it, I don't believe the time-sliced jobs can be gang scheduled.
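If it does turn out to be off, a rough sketch of what enabling it on your partition could look like (the FORCE:2 share count below is just an illustration, not a recommendation; pick whatever suits your workload):

    PartitionName=asimov01 Nodes=asimov Default=YES MaxTime=INFINITE MaxNodes=1 DefCpuPerGPU=2 OverSubscribe=FORCE:2 State=UP

then run "scontrol reconfigure" (or restart slurmctld) and confirm the OverSubscribe value shows up in "scontrol show partition asimov01".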
Can you do a "scontrol show partition" to verify that it is?

On Thu, Jan 12, 2023 at 6:24 PM Helder Daniel <hdan...@ualg.pt> wrote:
> Hi,
>
> I am trying to enable gang scheduling on a server with a CPU with 32 cores
> and 4 GPUs.
>
> However, using gang scheduling, the CPU jobs (or GPU jobs) are not being
> preempted after the time slice, which is set to 30 seconds.
>
> Below is a snapshot of squeue. There are 3 jobs, each needing 32 cores. The
> first 2 jobs launched are never preempted. The 3rd job is forever starving
> (or at least until one of the other 2 ends):
>
>   JOBID PARTITION     NAME    USER ST  TIME NODES NODELIST(REASON)
>     313  asimov01 cpu-only hdaniel PD  0:00     1 (Resources)
>     311  asimov01 cpu-only hdaniel  R  1:52     1 asimov
>     312  asimov01 cpu-only hdaniel  R  1:49     1 asimov
>
> The same happens with GPU jobs. If I launch 5 jobs requiring one GPU
> each, the 5th job will never run. Preemption is not working with the
> specified time slice.
>
> I tried several combinations:
>
> SchedulerType=sched/builtin and backfill
> SelectType=select/cons_tres and linear
>
> I'd appreciate any help and suggestions.
> The slurm.conf is below.
> Thanks
>
> ClusterName=asimov
> SlurmctldHost=localhost
> MpiDefault=none
> ProctrackType=proctrack/linuxproc  # proctrack/cgroup
> ReturnToService=2
> SlurmctldPidFile=/var/run/slurmctld.pid
> SlurmctldPort=6817
> SlurmdPidFile=/var/run/slurmd.pid
> SlurmdPort=6818
> SlurmdSpoolDir=/var/lib/slurm/slurmd
> SlurmUser=slurm
> StateSaveLocation=/var/lib/slurm/slurmctld
> SwitchType=switch/none
> TaskPlugin=task/none  # task/cgroup
> #
> # TIMERS
> InactiveLimit=0
> KillWait=30
> MinJobAge=300
> SlurmctldTimeout=120
> SlurmdTimeout=300
> Waittime=0
> #
> # SCHEDULING
> #FastSchedule=1  # obsolete
> SchedulerType=sched/builtin  # backfill
> SelectType=select/cons_tres
> SelectTypeParameters=CR_Core  # CR_Core_Memory lets only one job run at a time
> PreemptType=preempt/partition_prio
> PreemptMode=SUSPEND,GANG
> SchedulerTimeSlice=30  # in seconds, default 30
> #
> # LOGGING AND ACCOUNTING
> #AccountingStoragePort=
> AccountingStorageType=accounting_storage/none
> #AccountingStorageEnforce=associations
> #ClusterName=bip-cluster
> JobAcctGatherFrequency=30
> JobAcctGatherType=jobacct_gather/linux
> SlurmctldDebug=info
> SlurmctldLogFile=/var/log/slurm/slurmctld.log
> SlurmdDebug=info
> SlurmdLogFile=/var/log/slurm/slurmd.log
> #
> #
> # COMPUTE NODES
> #NodeName=asimov CPUs=64 RealMemory=500 State=UNKNOWN
> #PartitionName=LocalQ Nodes=ALL Default=YES MaxTime=INFINITE State=UP
>
> # Partitions
> GresTypes=gpu
> NodeName=asimov Gres=gpu:4 Sockets=1 CoresPerSocket=32 ThreadsPerCore=2 State=UNKNOWN
> PartitionName=asimov01 Nodes=asimov Default=YES MaxTime=INFINITE MaxNodes=1 DefCpuPerGPU=2 State=UP