Re: [slurm-users] slurm power save question

2023-11-29 Thread Davide DelVento
Thanks, and no worries about the time it took to reply. Sounds good then, and it's consistent with what the documentation says, namely "prevent those nodes from being powered down". As you said, "keep that number of nodes up" is a different thing, and yes, it would be nice to have. For that purpose,

Re: [slurm-users] slurm power save question

2023-11-29 Thread Brian Andrus
Sorry for the late reply. For my site, I used the optional ":" separator to ensure at least 4 nodes were up. E.g.: nid[10-20]:4. This means at least 4 nodes; those nodes do not have to be the same 4 at any time, so if one that used to be idle is down, but 4 are up, that one will not be brought

Re: [slurm-users] slurm power save question

2023-11-23 Thread Davide DelVento
Thanks for confirming, Brian. That was my understanding as well. Do you have it working that way on a machine you have access to? If so, I'd be interested to see the config file, because that's not the behavior I am experiencing in my tests. In fact, in my tests Slurm will not bring down those "X

Re: [slurm-users] slurm power save question

2023-11-22 Thread Brian Andrus
As I understand it, that setting means "Always have at least X nodes up", which includes running jobs. So it stops any wait time for the first X jobs being submitted, but any jobs after that will need to wait for the power_up sequence. Brian Andrus On 11/22/2023 6:58 AM, Davide DelVento

[slurm-users] slurm power save question

2023-11-22 Thread Davide DelVento
I've started playing with powersave and have a question about SuspendExcNodes. The documentation at https://slurm.schedmd.com/power_save.html says: For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE and not DOWN, DRAINING or already powered down) in the set nid[10-20] from being
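
For readers following the thread, here is a minimal Python sketch of one possible reading of the documentation sentence quoted above: with nid[10-20]:4, up to 4 usable (IDLE, powered-up) nodes are protected from power-down, and only idle nodes beyond that count may be suspended. This is not Slurm's implementation. The function name, the simplified state labels, and the choice of which idle nodes are protected (here simply the first ones in sorted order) are all assumptions made for illustration.

```python
def nodes_eligible_for_suspend(states, keep):
    """Return the idle nodes that could be powered down under this reading.

    states: dict mapping node name -> simplified state label
            ("IDLE", "ALLOCATED", "DOWN", "DRAINING", "POWERED_DOWN").
            These labels are illustrative, not Slurm's internal states.
    keep:   the count after the ':' in SuspendExcNodes (4 in nid[10-20]:4).
    """
    # Only IDLE nodes count as "usable" in the quoted sentence.
    usable = sorted(name for name, state in states.items() if state == "IDLE")
    # Assumption: the first `keep` usable nodes are the protected ones.
    # Everything after them is a candidate for power-down.
    return usable[keep:]


example_states = {
    "nid10": "IDLE",
    "nid11": "IDLE",
    "nid12": "ALLOCATED",
    "nid13": "DOWN",
    "nid14": "IDLE",
    "nid15": "IDLE",
    "nid16": "IDLE",
}
print(nodes_eligible_for_suspend(example_states, 4))
```

Under this reading, allocated nodes do not count toward the 4, which differs from the "always have at least X nodes up, including running jobs" interpretation discussed in the replies.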