On 5/11/21 4:47 am, Diego Zuccato wrote:
How can Slurm detect such an old HWLOC version?
Looking at the code, it's not actually checking the hwloc version; it's
hitting an error condition and suggesting that an old hwloc may be the
cause, but it sounds like that's not the case for you.
src/plugins/task/cgroup/task
Hi Ole.
I'm using the packages from Debian stable (slurm 20.11.4, hwloc 2.4.1).
And I checked: hwloc is installed on all the nodes. Quite obvious since
it's a dep for slurmd:
https://packages.debian.org/bullseye/slurmd
Being a dep, I "suspect" slurmd is built with hwloc support.
Diego
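
For what it's worth, the version reasoning above can be sanity-checked with a few lines of Python. This is only a sketch: it assumes plain dotted numeric version strings (a Debian package suffix would need stripping first), and the helper names are made up for illustration, not taken from Slurm or hwloc.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.4.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Release named in the slurmstepd error message quoted in this thread.
FIXED_IN = parse_version("1.11.5")


def predates_fix(installed):
    """Return True if the installed hwloc is older than the named release."""
    return parse_version(installed) < FIXED_IN


# hwloc version reported for Debian stable in this thread.
print(predates_fix("2.4.1"))
```

If this prints False for the installed version, the hint in the error message about an old hwloc presumably does not apply, which matches the reading that Slurm is reporting an error condition rather than detecting a version.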
On 07/1
Hi Diego,
Are you sure that the Slurm software installed on all compute nodes was
actually built on a system which had the hwloc packages installed? They
should also be installed on the compute nodes. The prerequisite
packages are listed here:
https://wiki.fysik.dtu.dk/niflheim/Slurm_instal
They aren't using modules so it must be something system-wide :(
But not all jobs are impacted, and it seems to be a bit random (it
doesn't always happen).
I'm out of ideas, currently :(
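
One way to narrow down the "a bit random" pattern might be to tally which nodes emit the error across collected job output. A minimal sketch, assuming the lines keep the `slurmstepd-<node>: error: ...` shape quoted in this thread; the function name and the sample list are illustrative only.

```python
import re
from collections import Counter

# Assumed line shape, copied from the error quoted in this thread:
#   slurmstepd-<node>: error: hwloc_get_obj_below_by_type() failing, ...
HWLOC_ERROR = re.compile(
    r"^slurmstepd-(?P<node>[^:\s]+): error: "
    r"hwloc_get_obj_below_by_type\(\) failing"
)


def count_hwloc_errors(lines):
    """Count occurrences of the hwloc error per node name."""
    counts = Counter()
    for line in lines:
        match = HWLOC_ERROR.match(line.strip())
        if match:
            counts[match.group("node")] += 1
    return counts


# Sample lines taken from the error report in this thread.
sample = [
    "slurmstepd-str957-mtx-01: error: hwloc_get_obj_below_by_type() failing,",
    "task/affinity plugin may be required to address bug fixed in HWLOC",
    "version 1.11.5",
    "slurmstepd-str957-mtx-01: error: task[0] unable to set taskset '0x0'",
]
print(count_hwloc_errors(sample))
```

Feeding it the stderr of many jobs (for example via `open(path)` on each output file) would show whether the failures cluster on particular nodes or are spread evenly.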
On 05/11/2021 13:10, Ole Holm Nielsen wrote:
On 11/5/21 12:47, Diego Zuccato wrote:
Some users are reporting this error:
slurmstepd-str957-mtx-01: error: hwloc_get_obj_below_by_type() failing,
task/affinity plugin may be required to address bug fixed in HWLOC version
1.11.5
slurmstepd-str957-mtx-01: error: task[0] unable to set taskset
Hello all.
Some users are reporting this error:
slurmstepd-str957-mtx-01: error: hwloc_get_obj_below_by_type() failing,
task/affinity plugin may be required to address bug fixed in HWLOC
version 1.11.5
slurmstepd-str957-mtx-01: error: task[0] unable to set taskset '0x0'
I checked on that nod