On 2024/07/10 16:25, jack.mellor--- via slurm-users wrote:
We are running slurm 23.02.6.
Our nodes have hyperthreading disabled and we have slurm.conf
set to CPU=32 for each node (each node has 2 processors with 16 cores each).
When we allocate a job, such as salloc -n 32, it allocates
a whole node, but sinfo shows double the allocation, with TRES=64.
It also shows in sinfo that the node has 4294967264 idle CPUs.
What does an
scontrol show node
tell you about the node(s)?
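If the full output is too noisy, something like the following trims it
to the CPU-related fields. This is only a sketch: cpu_fields is a helper
name made up here, node001 in the usage line is a placeholder node name,
and CPUEfctv may not appear on older Slurm releases.

```shell
# Hypothetical helper: keep only the CPU/topology fields from
# "scontrol show node" output read on stdin.
cpu_fields() {
    # scontrol prints several key=value pairs per line, so put each
    # pair on its own line before filtering.
    tr -s ' ' '\n' |
        grep -E '^(CPUAlloc|CPUEfctv|CPUTot|Sockets|CoresPerSocket|ThreadsPerCore|CfgTRES|AllocTRES)='
}
```

Usage would be along the lines of: scontrol show node node001 | cpu_fields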
On our systems, where, sadly, our vendor is unable/unwilling
to turn off SMT/hyperthreading, we see (not all fields shown),
for a fully allocated AMD EPYC 7763 node (so 128 physical cores):
CoresPerSocket=64
CPUAlloc=256 CPUEfctv=256 CPUTot=256
Sockets=2 Boards=1
ThreadsPerCore=2
CfgTRES=cpu=256
AllocTRES=cpu=256
so I guess the question would be,
depending on exactly what you see,
have you explicitly set, or tried setting,
ThreadsPerCore=1
in the config?
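
As for the 4294967264 idle CPUs in the quoted message: that number is
exactly 2^32 - 32, which is what you would get if idle were computed as
32 configured minus 64 allocated and the result stored in an unsigned
32-bit integer. That is an assumption about how the figure arises, not
something checked against the Slurm source, and wrap_u32 is just a
made-up helper to show the arithmetic.

```shell
# Hypothetical helper: reduce an integer to its unsigned 32-bit value,
# the way a uint32_t would wrap a negative result.
wrap_u32() {
    echo $(( ($1) & 0xFFFFFFFF ))
}

# Assumed: idle = configured CPUs - allocated CPUs = 32 - 64
wrap_u32 $(( 32 - 64 ))    # prints 4294967264
```

If that is what is happening, the odd idle count and the doubled
AllocTRES would be two symptoms of the same mismatch rather than
separate problems.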
--
Supercomputing Systems Administrator
Pawsey Supercomputing Centre
SMS: +61 4 7497 6266
Eml: kevin.buck...@pawsey.org.au