Hi, I have a question regarding the default number of CPUs allocated per GPU (`DefCpuPerGPU` in `slurm.conf`). Note first that the documentation refers to `DefCpusPerGPU` (with an 's' after 'Cpu'), but slurmctld only understands `DefCpuPerGPU` (cf. https://bugs.schedmd.com/show_bug.cgi?id=7203).
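As a quick sanity check on the naming issue, one can dump the running configuration and look for the per-GPU CPU default (a sketch; I am assuming `scontrol show config` reports this key in 19.05, and the exact key spelling may vary between versions):

```shell
# Show what the controller actually parsed from slurm.conf.
# The key may appear as DefCpuPerGPU or DefCPUsPerGPU depending on version.
scontrol show config | grep -i 'cpu.*pergpu'
```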
So here is what the `slurm.conf` doc states:

```
DefCpusPerGPU
    Default count of CPUs allocated per allocated GPU
```

Here is an extract of my `slurm.conf` file setting `DefCpuPerGPU=32`:

```
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_tres
FastSchedule=1
SelectTypeParameters=CR_CPU_Memory
PriorityType=priority/multifactor
PriorityFlags=CALCULATE_RUNNING,SMALL_RELATIVE_TO_TIME
PriorityFavorSmall=yes
DefMemPerCPU=2000
MaxMemPerCPU=2800
DefMemPerGPU=80000
DefCpuPerGPU=32

# COMPUTE NODES
GresTypes=gpu
NodeName=XXXX NodeAddr=XXXX Gres=gpu:rtx2080:2,gpu:gtx1080:1 Sockets=4 CoresPerSocket=16 ThreadsPerCore=2 RealMemory=376000 MemSpecLimit=10000 State=UNKNOWN
PartitionName=prod Nodes=XXXX OverSubscribe=YES Default=YES MaxTime=INFINITE DefaultTime=2:0:0 State=UP
```

However, if I request a GPU, I only get 2 cores by default:

```
$ srun --gpus=1 --pty bash
$ taskset -c -p $$
pid 127735's current affinity list: 1,65
```

I use Slurm 19.05 on an Arch Linux machine with the `slurm-llnl` AUR package. Here are my `gres.conf` file and my system info:

- gres.conf

```
NodeName=XXXX Name=gpu Type=rtx2080 File=/dev/nvidia0 Cores=32-63
NodeName=XXXX Name=gpu Type=rtx2080 File=/dev/nvidia1 Cores=64-95
NodeName=XXXX Name=gpu Type=gtx1080 File=/dev/nvidia2 Cores=96-127
```

- System info

```
$ uname -a
Linux XXXX 5.1.15-arch1-1-ARCH #1 SMP PREEMPT Tue Jun 25 04:49:39 UTC 2019 x86_64 GNU/Linux
```

Thanks in advance,

Best regards,

Ghislain Durif