Okay, I verified the MultipleFiles approach on a test Slurm install
with one controller and two nodes, and it works (with
ConstrainDevices=yes)!
Name=gpu Type=3090
MultipleFiles=/dev/nvidia0,/dev/dri/card1,/dev/dri/renderD128
Name=gpu Type=3090
MultipleFiles=/dev/nvidia1,/dev/dr
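For completeness, ConstrainDevices=yes is set in cgroup.conf, not
gres.conf, and device constraint also needs the cgroup task plugin
enabled. A minimal sketch of the two supporting files I'd expect to go
with the snippet above (check your version's man pages; these are the
generic options, not copied from the poster's setup):

  # cgroup.conf
  ConstrainDevices=yes

  # slurm.conf
  TaskPlugin=task/cgroup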
David,
There are several possible answers depending on what you hope to
accomplish. What exactly is the issue that you're trying to solve? Do
you mean that you have users who need, say, 8 GB of RAM per core but you
only have 4 GB of RAM per core on the system and you want a way to
account fo
Also, I recommend setting:
*CoreSpecCount*
Number of cores reserved for system use. These cores will not be
available for allocation to user jobs. Depending upon the
*TaskPluginParam* option of *SlurmdOffSpec*, Slurm daemons (i.e.
slurmd and slurmstepd) may either be confined to these
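As the quoted slurm.conf documentation says, CoreSpecCount is a
per-node parameter and SlurmdOffSpec is a TaskPluginParam option. A
hedged sketch of how the two fit together (node name and counts are
made up for illustration):

  # slurm.conf (illustrative values)
  TaskPluginParam=SlurmdOffSpec
  NodeName=node01 CPUs=32 CoreSpecCount=2 State=UNKNOWN

With this, 2 cores on node01 are reserved for system use, and
SlurmdOffSpec asks the Slurm daemons to stay off the cores given to
user jobs rather than being confined to the specialized ones.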
You can actually spoof the number of cores and RAM on a node by using
the config_override option. I've used that before for testing
purposes. Mind you, core binding and other features like that will not
work if you start spoofing the number of cores and RAM, so use it with caution.
-Paul Edmon-
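On recent Slurm releases the override Paul describes is spelled
SlurmdParameters=config_overrides in slurm.conf (older releases used
FastSchedule=2 for the same effect); check the slurm.conf man page for
your version. A sketch, with deliberately fake hardware numbers:

  # slurm.conf (spoofed values for testing only)
  SlurmdParameters=config_overrides
  NodeName=node01 CPUs=64 RealMemory=256000 State=UNKNOWN

With config_overrides, slurmd registers the configured CPU and memory
counts instead of what it detects on the node.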
Maybe I have good news, Stephan (and others). I discovered that Slurm
20.11 added a MultipleFiles option to gres.conf, which replaces File=. There
are no docs about it yet, but I found a (possibly) working snippet
making use of this option here:
https://bugs.schedmd.com/show_bug.cgi?id=11091#c13 .