Hi Xaver,
we also had a similar problem with Slurm 21.08 (see thread "error: power_save
module disabled, NULL SuspendProgram").
Fortunately, we have not yet observed this since the upgrade to 23.02. But the
time period (about a month) is still too short to know if the problem is
really fixed
Hi Ole,
for multiple reasons we build it ourselves. I am not really involved
in that process, but I will contact the person who is. Thanks for the
recommendation! We should probably implement a regular check for new
Slurm versions. I am not 100% sure whether this will fix our
issues or
On 12/6/23 11:51, Xaver Stiensmeier wrote:
Good idea. Here's our current version:
```
sinfo -V
slurm 22.05.7
```
Quick googling told me that the latest version is 23.11. Does the
upgrade change anything in that regard? I will keep reading.
There are nice bug fixes in 23.02 mentioned in my SLU
Hi Ole,
Good idea. Here's our current version:
```
sinfo -V
slurm 22.05.7
```
Quick googling told me that the latest version is 23.11. Does the
upgrade change anything in that regard? I will keep reading.
Xaver
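
For reference, a minimal sketch of how the installed version could be compared against a known release, assuming the `sinfo -V` output format shown above (`slurm 22.05.7`) and GNU `sort -V`. The helper names `version_lt` and `parse_slurm_version` and the hand-maintained `latest` value are illustrative assumptions, not part of Slurm.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is strictly older than $2.
# Relies on GNU sort's -V (version sort) to order the two strings.
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical helper: extract the version from a line like "slurm 22.05.7".
parse_slurm_version() {
    printf '%s\n' "$1" | awk '{print $2}'
}

# In practice this would be: installed=$(parse_slurm_version "$(sinfo -V)")
installed=$(parse_slurm_version "slurm 22.05.7")
latest="23.11.0"   # assumption: maintained by hand or by some other process

if version_lt "$installed" "$latest"; then
    echo "Slurm $installed is older than $latest"
fi
```

Run from cron, a check like this would only report that a newer release exists; whether upgrading resolves the power saving problem is a separate question.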
On 06.12.23 11:09, Ole Holm Nielsen wrote:
Hi Xaver,
Your version of Slurm may m
Hi Xaver,
Your version of Slurm may matter for your power saving experience. Do you
run an updated version?
/Ole
On 12/6/23 10:54, Xaver Stiensmeier wrote:
Hi Ole,
I will double check, but I am very sure that giving a reason is possible
as it has been done at least 20 other times without e
Hi Ole,
I will double check, but I am very sure that giving a reason is possible
as it has been done at least 20 other times without error during that
exact run. It might be ignored though. You can also give a reason when
defining the states POWER_UP and POWER_DOWN. Slurm's documentation is
not a
Hi Xaver,
On 12/6/23 09:28, Xaver Stiensmeier wrote:
using https://slurm.schedmd.com/power_save.html we had one case out of
many (>242) node starts that resulted in
|slurm_update error: Invalid node state specified|
when we called:
|scontrol update NodeName="$1" state=RESUME reason=FailedSt
Dear Slurm User list,
using https://slurm.schedmd.com/power_save.html we had one case out of
many (>242) node starts that resulted in
|slurm_update error: Invalid node state specified|
when we called:
|scontrol update NodeName="$1" state=RESUME reason=FailedStartup|
in the Fail script. We run
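
To make the failing call easier to diagnose, the `scontrol update` line quoted above could be wrapped as in the following sketch. It assumes the script receives the node name as `$1`, as in the call above; the function name `resume_node` and the `SCONTROL` and `RETRY_DELAY` variables are hypothetical. Nothing in this thread establishes that a retry avoids the "Invalid node state specified" error; the wrapper only makes a failure visible and repeatable.

```shell
#!/bin/sh
# Sketch of a wrapper for the resume call in a fail script.
# SCONTROL and RETRY_DELAY are illustrative knobs, not Slurm settings.
SCONTROL=${SCONTROL:-scontrol}
RETRY_DELAY=${RETRY_DELAY:-5}

# resume_node NODE [ATTEMPTS]
# Tries the update up to ATTEMPTS times (default 3) and logs each failure.
resume_node() {
    node=$1
    attempts=${2:-3}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$SCONTROL" update NodeName="$node" state=RESUME reason=FailedStartup; then
            return 0
        fi
        echo "attempt $i/$attempts: could not resume $node" >&2
        # Wait before the next attempt, but not after the last one.
        [ "$i" -lt "$attempts" ] && sleep "$RETRY_DELAY"
        i=$((i + 1))
    done
    return 1
}

# Only act when called with a node name, as Slurm would call the script.
if [ "$#" -gt 0 ]; then
    resume_node "$1"
fi
```

The messages written to stderr give a timestamped place to look in the logs the next time the error appears.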