On 2024/07/10 16:25, jack.mellor--- via slurm-users wrote:

We are running slurm 23.02.6.
Our nodes have hyperthreading disabled and we have slurm.conf
set to CPU=32 for each node (each node has 2 processors with 16 cores each).
When we allocate a job, such as salloc -n 32, it allocates
a whole node, but sinfo shows double the allocation in the TRES=64.
It also shows in sinfo that the node has 4294967264 idle CPUs.
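
As an aside on that idle count: 4294967264 is exactly 2^32 - 32, which is the value you would get if 64 allocated CPUs were subtracted from 32 configured CPUs in an unsigned 32-bit integer. I have not checked the Slurm source, so treat the mechanism as an assumption; the arithmetic itself can be sketched as follows (variable names are made up for illustration):

```python
# Hypothetical illustration only: a guess at the mechanism, not Slurm code.
configured_cpus = 32   # CPUs configured per node in slurm.conf
allocated_cpus = 64    # CPU count reported in the TRES for the allocation

# Emulate unsigned 32-bit subtraction by masking the result to 32 bits.
idle_cpus_u32 = (configured_cpus - allocated_cpus) & 0xFFFFFFFF

print(idle_cpus_u32)               # 4294967264
print(idle_cpus_u32 == 2**32 - 32)  # True
```

If that reading is right, the odd idle count is just a symptom of the doubled allocation rather than a separate problem.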

What does an

  scontrol show node

tell you about the node(s)?

On our systems, where, sadly, our vendor is unable/unwilling
to turn off SMT/hyperthreading, we see the following (not all
fields shown) for a fully allocated node with two AMD EPYC 7763
processors, so 128 physical cores:

   CoresPerSocket=64
   CPUAlloc=256 CPUEfctv=256 CPUTot=256
   Sockets=2 Boards=1
   ThreadsPerCore=2
   CfgTRES=cpu=256
   AllocTRES=cpu=256
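
For what it is worth, the numbers above are self-consistent: Sockets x CoresPerSocket x ThreadsPerCore gives the CPU total. A small sketch of that check, using only the fields quoted above (the parse_fields helper is made up for illustration and is not part of any Slurm tooling):

```python
def parse_fields(text):
    """Split whitespace-separated 'Key=Value' tokens into a dict."""
    fields = {}
    for token in text.split():
        key, _, value = token.partition("=")
        fields[key] = value
    return fields

# Fields copied from the scontrol show node output quoted above.
sample = """
CoresPerSocket=64
CPUAlloc=256 CPUEfctv=256 CPUTot=256
Sockets=2 Boards=1
ThreadsPerCore=2
"""

node = parse_fields(sample)
physical_cores = int(node["Sockets"]) * int(node["CoresPerSocket"])
expected_cpus = physical_cores * int(node["ThreadsPerCore"])

print(physical_cores)                         # 128
print(expected_cpus)                          # 256
print(expected_cpus == int(node["CPUTot"]))   # True
```

Running the same check against the output from your nodes should show whether Slurm thinks there are two threads per core.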

So I guess the question would be,
depending on exactly what you see,

have you explicitly set, or tried setting,

 ThreadsPerCore=1

in the config?
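
To make that concrete, here is a rough sketch of what a fully explicit node definition might look like for the hardware described in the question. It assumes "2 processors with 16 cores" means two sockets of 16 cores each; the node name is a placeholder, and the node_line helper is made up for illustration:

```python
def node_line(name, sockets, cores_per_socket, threads_per_core):
    """Build a slurm.conf NodeName line with the topology spelled out."""
    cpus = sockets * cores_per_socket * threads_per_core
    return (
        f"NodeName={name} CPUs={cpus} Sockets={sockets} "
        f"CoresPerSocket={cores_per_socket} ThreadsPerCore={threads_per_core}"
    )

# "node[01-04]" is a hypothetical node name; substitute your own.
line = node_line("node[01-04]", sockets=2, cores_per_socket=16, threads_per_core=1)
print(line)
# NodeName=node[01-04] CPUs=32 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1
```

With ThreadsPerCore stated explicitly, the CPU count and the topology cannot disagree about whether a core counts once or twice.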

--
Supercomputing Systems Administrator
Pawsey Supercomputing Centre
SMS: +61 4 7497 6266
Eml: kevin.buck...@pawsey.org.au


