Re: [slurm-users] slurm power save question

2023-11-29 Thread Davide DelVento
Thanks, and no worries about the time it took to reply.

Sounds good then, and it's consistent with what the documentation says,
namely "prevent those nodes from being powered down". As you said, "keep
that number of nodes up" is a different thing, and yes, it would be nice to
have.
For that purpose, I'm looking at my workload logs and mulling whether I
should make a cron job (submitting dummy Slurm jobs) to force Slurm to
bring nodes up when not enough idle ones are up, to reduce queue wait time
for users' jobs. A rough sketch of that idea follows.
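
For what it's worth, a minimal cron-driven sketch of that idea might look
like the following (the partition name, threshold, and job parameters are
assumptions for illustration, not a tested site script):

    #!/bin/bash
    # Hypothetical "keep-warm" helper, run from cron every few minutes.
    # If fewer than WANT_IDLE nodes are powered up and idle, submit short
    # placeholder jobs so that Slurm's power-save logic resumes more nodes.

    PARTITION=compute   # assumed partition name
    WANT_IDLE=4         # desired number of warm (powered-up) idle nodes

    # "%t" prints the compact node state: "idle" = up and free,
    # "idle~" = powered down, "idle#" = powering up. Count only warm nodes.
    warm=$(sinfo -h -N -p "$PARTITION" -o "%t" | grep -cx idle)

    missing=$(( WANT_IDLE - warm ))
    for _ in $(seq 1 "$missing"); do
        # Each trivial exclusive job should prompt Slurm to resume one
        # suspended node; adjust the time limit and size to taste.
        sbatch -p "$PARTITION" -N1 --exclusive -J keepwarm -t 2 \
               --wrap "sleep 60" >/dev/null
    done

A real version would presumably also check for already-pending keepwarm
jobs (e.g. with squeue -n keepwarm) before submitting more.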

Thanks again

On Wed, Nov 29, 2023 at 8:43 AM Brian Andrus  wrote:

> Sorry for the late reply.
>
> For my site, I used the optional ":" separator to ensure at least 4 nodes
> were up, e.g. nid[10-20]:4.
> This means at least 4 nodes; those nodes do not have to be the same 4 at
> any time, so if a node that used to be idle goes down while 4 others are
> up, that one will not be brought back up. I don't see this setting having
> much to do with bringing nodes up at all, with the exception of when you
> first start slurmctld and the settings are not met. Once there are jobs
> running on any of the listed nodes, they count toward the number. That is
> my experience with the small numbers I used; YMMV.
>
> I have also explicitly stated nodes without the separator, which does
> work. I do that when I am trying to look at a node that is idle without a
> job on it. That stops Slurm from shutting it down while I am looking at it.
>
> Although I do agree that the ability to "keep at least X nodes up and
> idle" would be nice, that is not how I see this documented or working.
>
> Brian Andrus
> On 11/23/2023 5:12 AM, Davide DelVento wrote:
>
> Thanks for confirming, Brian. That was my understanding as well. Do you
> have it working that way on a machine you have access to?  If so, I'd be
> interested to see the config file, because that's not the behavior I am
> experiencing in my tests.
> In fact, in my tests Slurm will not bring down those "X nodes" but will
> not bring them up either, *unless* there is a job targeted to those. I may
> have something misconfigured, and I'd love to fix that.
>
> Thanks!
>
> On Wed, Nov 22, 2023 at 5:46 PM Brian Andrus  wrote:
>
>> As I understand it, that setting means "Always have at least X nodes
>> up", which includes nodes running jobs. So it eliminates the wait time for
>> the first X jobs submitted, but any jobs after that will need to wait for
>> the power_up sequence.
>>
>> Brian Andrus
>> On 11/22/2023 6:58 AM, Davide DelVento wrote:
>>
>> I've started playing with powersave and have a question about
>> SuspendExcNodes. The documentation at
>> https://slurm.schedmd.com/power_save.html says
>>
>> For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE and not
>> DOWN, DRAINING or already powered down) in the set nid[10-20] from being
>> powered down.
>>
>> I initially interpreted that as "Slurm will try to keep 4 nodes up and
>> idle as much as possible", which would have reduced the wait time for new
>> jobs targeting those nodes. Instead, it appears to mean "Slurm will not
>> shut off the last 4 nodes which are idle in that partition; however, it
>> will not turn on nodes which it shut off earlier unless jobs are scheduled
>> on them".
>>
>> Most notably, if the 4 idle nodes are allocated to other jobs (and so
>> they are no longer idle), Slurm does not turn on any nodes which have been
>> shut off earlier, so it's possible (and, depending on workloads, perhaps
>> even common) to have no idle nodes powered on regardless of the
>> SuspendExcNodes setting.
>>
>> Is that how it works, or is there anything else in my settings causing
>> this unexpected-to-me behavior? I think I can live with it, but IMHO it
>> would have been better if Slurm preemptively turned on nodes to match the
>> requested SuspendExcNodes, rather than waiting for job submissions.
>>
>> Thanks and Happy Thanksgiving to people in the USA
>>
>>


Re: [slurm-users] slurm power save question

2023-11-29 Thread Brian Andrus

Sorry for the late reply.

For my site, I used the optional ":" separator to ensure at least 4
nodes were up, e.g. nid[10-20]:4.
This means at least 4 nodes; those nodes do not have to be the same 4
at any time, so if a node that used to be idle goes down while 4 others
are up, that one will not be brought back up. I don't see this setting
having much to do with bringing nodes up at all, with the exception of
when you first start slurmctld and the settings are not met. Once there
are jobs running on any of the listed nodes, they count toward the
number. That is my experience with the small numbers I used; YMMV.


I have also explicitly stated nodes without the separator, which does
work. I do that when I am trying to look at a node that is idle without
a job on it. That stops Slurm from shutting it down while I am looking
at it.
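
For readers who want the concrete syntax, a minimal slurm.conf sketch of
the two forms discussed above could look like this (node names and the
commented alternative are illustrative assumptions, not Brian's actual
site configuration):

    # slurm.conf excerpt (illustrative values only)
    # Count form: keep at least 4 usable nodes in nid[10-20] from being
    # powered down, without pinning which 4 they are.
    SuspendExcNodes=nid[10-20]:4
    # Plain form (alternative): exempt these exact nodes from power-down,
    # e.g. while inspecting an idle node.
    #SuspendExcNodes=nid[15-16]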


Although I do agree that the ability to "keep at least X nodes up and
idle" would be nice, that is not how I see this documented or working.


Brian Andrus

On 11/23/2023 5:12 AM, Davide DelVento wrote:
Thanks for confirming, Brian. That was my understanding as well. Do 
you have it working that way on a machine you have access to?  If so, 
I'd be interested to see the config file, because that's not the 
behavior I am experiencing in my tests.
In fact, in my tests Slurm will not bring down those "X nodes" but 
will not bring them up either, *unless* there is a job targeted to 
those. I may have something misconfigured, and I'd love to fix that.


Thanks!

On Wed, Nov 22, 2023 at 5:46 PM Brian Andrus  wrote:

As I understand it, that setting means "Always have at least X
nodes up", which includes nodes running jobs. So it eliminates the
wait time for the first X jobs submitted, but any jobs after that
will need to wait for the power_up sequence.

Brian Andrus

On 11/22/2023 6:58 AM, Davide DelVento wrote:

I've started playing with powersave and have a question about
SuspendExcNodes. The documentation at
https://slurm.schedmd.com/power_save.html says

For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE
and not DOWN, DRAINING or already powered down) in the set
nid[10-20] from being powered down.

I initially interpreted that as "Slurm will try to keep 4 nodes
up and idle as much as possible", which would have reduced the
wait time for new jobs targeting those nodes. Instead, it appears
to mean "Slurm will not shut off the last 4 nodes which are idle
in that partition; however, it will not turn on nodes which it
shut off earlier unless jobs are scheduled on them".

Most notably, if the 4 idle nodes are allocated to other jobs
(and so they are no longer idle), Slurm does not turn on any
nodes which have been shut off earlier, so it's possible (and,
depending on workloads, perhaps even common) to have no idle
nodes powered on regardless of the SuspendExcNodes setting.

Is that how it works, or is there anything else in my settings
causing this unexpected-to-me behavior? I think I can live with
it, but IMHO it would have been better if Slurm preemptively
turned on nodes to match the requested SuspendExcNodes, rather
than waiting for job submissions.

Thanks and Happy Thanksgiving to people in the USA


Re: [slurm-users] slurm power save question

2023-11-23 Thread Davide DelVento
Thanks for confirming, Brian. That was my understanding as well. Do you
have it working that way on a machine you have access to?  If so, I'd be
interested to see the config file, because that's not the behavior I am
experiencing in my tests.
In fact, in my tests Slurm will not bring down those "X nodes" but will not
bring them up either, *unless* there is a job targeted to those. I may have
something misconfigured, and I'd love to fix that.

Thanks!

On Wed, Nov 22, 2023 at 5:46 PM Brian Andrus  wrote:

> As I understand it, that setting means "Always have at least X nodes up",
> which includes nodes running jobs. So it eliminates the wait time for the
> first X jobs submitted, but any jobs after that will need to wait for the
> power_up sequence.
>
> Brian Andrus
> On 11/22/2023 6:58 AM, Davide DelVento wrote:
>
> I've started playing with powersave and have a question about
> SuspendExcNodes. The documentation at
> https://slurm.schedmd.com/power_save.html says
>
> For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE and not
> DOWN, DRAINING or already powered down) in the set nid[10-20] from being
> powered down.
>
> I initially interpreted that as "Slurm will try to keep 4 nodes up and
> idle as much as possible", which would have reduced the wait time for new
> jobs targeting those nodes. Instead, it appears to mean "Slurm will not
> shut off the last 4 nodes which are idle in that partition; however, it
> will not turn on nodes which it shut off earlier unless jobs are scheduled
> on them".
>
> Most notably, if the 4 idle nodes are allocated to other jobs (and so
> they are no longer idle), Slurm does not turn on any nodes which have been
> shut off earlier, so it's possible (and, depending on workloads, perhaps
> even common) to have no idle nodes powered on regardless of the
> SuspendExcNodes setting.
>
> Is that how it works, or is there anything else in my settings causing
> this unexpected-to-me behavior? I think I can live with it, but IMHO it
> would have been better if Slurm preemptively turned on nodes to match the
> requested SuspendExcNodes, rather than waiting for job submissions.
>
> Thanks and Happy Thanksgiving to people in the USA
>
>


Re: [slurm-users] slurm power save question

2023-11-22 Thread Brian Andrus
As I understand it, that setting means "Always have at least X nodes
up", which includes nodes running jobs. So it eliminates the wait time
for the first X jobs submitted, but any jobs after that will need to
wait for the power_up sequence.
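
For context, the power_up sequence mentioned here is governed by the
site's resume/suspend settings in slurm.conf; a minimal sketch might look
like the following (program paths and timing values are assumptions, not
taken from this thread):

    # slurm.conf excerpt (illustrative values only)
    SuspendProgram=/usr/local/sbin/node_poweroff   # assumed site script
    ResumeProgram=/usr/local/sbin/node_poweron     # assumed site script
    SuspendTime=600      # power a node down after 10 minutes idle
    SuspendTimeout=60    # seconds allowed for a node to power down
    ResumeTimeout=300    # max seconds for a resumed node to become usable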


Brian Andrus

On 11/22/2023 6:58 AM, Davide DelVento wrote:
I've started playing with powersave and have a question about 
SuspendExcNodes. The documentation at 
https://slurm.schedmd.com/power_save.html says


For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE and
not DOWN, DRAINING or already powered down) in the set
nid[10-20] from being powered down.


I initially interpreted that as "Slurm will try to keep 4 nodes up
and idle as much as possible", which would have reduced the wait time
for new jobs targeting those nodes. Instead, it appears to mean "Slurm
will not shut off the last 4 nodes which are idle in that partition;
however, it will not turn on nodes which it shut off earlier unless
jobs are scheduled on them".


Most notably, if the 4 idle nodes are allocated to other jobs (and
so they are no longer idle), Slurm does not turn on any nodes which
have been shut off earlier, so it's possible (and, depending on
workloads, perhaps even common) to have no idle nodes powered on
regardless of the SuspendExcNodes setting.


Is that how it works, or is there anything else in my settings
causing this unexpected-to-me behavior? I think I can live with it,
but IMHO it would have been better if Slurm preemptively turned on
nodes to match the requested SuspendExcNodes, rather than waiting
for job submissions.


Thanks and Happy Thanksgiving to people in the USA

[slurm-users] slurm power save question

2023-11-22 Thread Davide DelVento
I've started playing with powersave and have a question about
SuspendExcNodes. The documentation at
https://slurm.schedmd.com/power_save.html says

For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE and not
DOWN, DRAINING or already powered down) in the set nid[10-20] from being
powered down.

I initially interpreted that as "Slurm will try to keep 4 nodes up and idle
as much as possible", which would have reduced the wait time for new jobs
targeting those nodes. Instead, it appears to mean "Slurm will not shut off
the last 4 nodes which are idle in that partition; however, it will not
turn on nodes which it shut off earlier unless jobs are scheduled on them".

Most notably, if the 4 idle nodes are allocated to other jobs (and so they
are no longer idle), Slurm does not turn on any nodes which have been shut
off earlier, so it's possible (and, depending on workloads, perhaps even
common) to have no idle nodes powered on regardless of the SuspendExcNodes
setting.

Is that how it works, or is there anything else in my settings causing this
unexpected-to-me behavior? I think I can live with it, but IMHO it would
have been better if Slurm preemptively turned on nodes to match the
requested SuspendExcNodes, rather than waiting for job submissions.

Thanks and Happy Thanksgiving to people in the USA