In another thread, on 26-01-2021 17:44, Prentice Bisbal wrote:
> Personally, I think it's good that Slurm RPMs are now available through
> EPEL, although I won't be able to use them, and I'm sure many people on
> the list won't be able to either, since licensing issues prevent them
> from providing support for NVIDIA drivers, so those of us with GPUs on
> our clusters will still have to compile Slurm from source to include
> NVIDIA GPU support.
We're running Slurm 20.02.6 and recently added some NVIDIA GPU nodes.
The Slurm GPU documentation seems to be
https://slurm.schedmd.com/gres.html
We don't seem to have any problems scheduling jobs on GPUs, even though
our Slurm RPM build host doesn't have any NVIDIA software installed, as
shown by the following command, which prints nothing on that host:
$ ldconfig -p | grep libnvidia-ml
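For anyone who wants to repeat this check on their own build host, here is a minimal sketch that wraps the same test in a small shell function. The function name has_nvml is my own (hypothetical), and it only inspects the dynamic linker cache, so a copy of the library installed outside the cached paths would be missed.

```shell
#!/bin/sh
# has_nvml (hypothetical helper): succeed if the given `ldconfig -p` style
# listing mentions the NVIDIA Management Library (libnvidia-ml).
has_nvml() {
    printf '%s\n' "$1" | grep -q 'libnvidia-ml'
}

# Capture the linker cache listing; tolerate ldconfig being absent from PATH.
listing=$(ldconfig -p 2>/dev/null || true)

if has_nvml "$listing"; then
    echo "libnvidia-ml found in the linker cache"
else
    echo "libnvidia-ml not found in the linker cache"
fi
```

On our build host this would report that the library is not found, which matches the empty grep result.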
I'm curious about Prentice's statement that NVIDIA libraries need to be
installed when building Slurm RPMs. I read the discussion in bug
9525,
https://bugs.schedmd.com/show_bug.cgi?id=9525
from which it seems that the problem was fixed in 20.02.6 and 20.11.
Question: Is there anything special that needs to be done when building
Slurm RPMs with NVIDIA GPU support?
Thanks,
Ole