[slurm-users] Slurm-gcp supported OS

2022-08-05 Thread Simon Gao
Hi,

Is there a plan or ETA to add RockyLinux 8 to the supported OS for slurm-gcp?

Simon



Re: [slurm-users] Changing a user's default account

2022-08-05 Thread Thomas M. Payerle
sacctmgr add/delete user basically adds/deletes a Slurm association for
that user/cluster/account.

You need to add (an association for) the user for account B before you can
change their default account to B.

You do *not* need to delete (the association for) the user with account A
if that is not desired; a user may be associated with more than one Slurm
account. The user can then run e.g. sbatch --account A myjob.sh to submit a
job charged against account A even if their default account is B.

If you do not wish to "lose history" as you put it but wish to prevent the
user from using account A, you can
1) add user to account B
2) change defaultaccount for user to account B
3) disable the user's access to account A with something like
sacctmgr modify user where user=USERNAME account=A set MaxSubmitJobs=0
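
A minimal sketch of the three steps above, in Python, that only builds and
prints the sacctmgr command lines rather than running them. The helper name
move_default_account and the use of sacctmgr's -i (immediate, no confirmation
prompt) flag are assumptions added for illustration; USERNAME, A and B are the
placeholders used above.

```python
import shlex


def move_default_account(user, new_account, old_account, block_old=True):
    """Return sacctmgr command lines (argv lists) for the steps above.

    Hypothetical helper: nothing is executed here, the caller decides
    whether to print the commands or hand them to subprocess.run().
    """
    cmds = [
        # 1) create the association between the user and the new account
        ["sacctmgr", "-i", "add", "user", user, f"account={new_account}"],
        # 2) the association now exists, so the default can be switched
        ["sacctmgr", "-i", "modify", "user", "where", f"user={user}",
         "set", f"defaultaccount={new_account}"],
    ]
    if block_old:
        # 3) keep the old association but forbid new submissions under it
        cmds.append(["sacctmgr", "-i", "modify", "user", "where",
                     f"user={user}", f"account={old_account}",
                     "set", "MaxSubmitJobs=0"])
    return cmds


if __name__ == "__main__":
    for argv in move_default_account("USERNAME", "B", "A"):
        print(shlex.join(argv))
```

Passing block_old=False leaves the old association untouched, so the user can
still charge jobs to account A with sbatch --account A.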


On Fri, Aug 5, 2022 at 11:26 AM Chip Seraphine 
wrote:

> Thanks.  Guess adding/deleting is the way to go then – I was hoping not to
> lose user history, but alas.
>
>
> From: slurm-users  on behalf of
> "Renfro, Michael" 
> Reply-To: Slurm User Community List 
> Date: Friday, August 5, 2022 at 10:17 AM
> To: Slurm User Community List 
> Subject: [ext] Re: [slurm-users] Changing a user's default account
>
> This should work:
>
> sacctmgr add user someuser account=newaccount # adds user to new account
>
> sacctmgr modify user where user=someuser set defaultaccount=newaccount # change default
>
> sacctmgr remove user where user=someuser and account=oldaccount # remove from old account
>
> From: slurm-users  on behalf of
> Chip Seraphine 
> Date: Friday, August 5, 2022 at 9:56 AM
> To: Slurm User Community List 
> Subject: [slurm-users] Changing a user's default account
> External Email Warning
>
> This email originated from outside the university. Please use caution when
> opening attachments, clicking links, or responding to requests.
>
> 
>
> I have a user U who is in association with account A, and I want to change
> that to account B. The obvious thing does not work:
>
> $ sacctmgr modify user where user="U" set defaultaccount="B"
> Can't modify because these users aren't associated with new default
> account “B”…
>
> OK, fair enough. But I can't find a good way to meet this requirement!
> "sacctmgr create assoc" does not seem to be a thing. Googling around I
> see a lot of wags deleting and recreating the user in this situation, which
> I definitely do _not_ want to do.
>
> How does one change the account that a user is tied to?
>
>
> --
>
> Chip Seraphine
> Linux Admin (Grid)
>
> This e-mail and any attachments may contain information that is
> confidential and proprietary and otherwise protected from disclosure. If
> you are not the intended recipient of this e-mail, do not read, duplicate
> or redistribute it by any means. Please immediately delete it and any
> attachments and notify the sender that you have received it by mistake.
> Unintended recipients are prohibited from taking action on the basis of
> information in this e-mail or any attachments. The DRW Companies make no
> representations that this e-mail or any attachments are free of computer
> viruses or other defects.
>


-- 
Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads   paye...@umd.edu
5825 University Research Park   (301) 405-6135
University of Maryland
College Park, MD 20740-3831


Re: [slurm-users] Changing a user's default account

2022-08-05 Thread Tina Friedrich

I'm not sure what you mean by losing user history?

You did say you want to change the user's association from account 'A' to 
account 'B' - well, yes, that means associating the user with B (i.e. 
adding them to account B), and removing them from A.


The removing from A is optional of course - users can be associated with 
multiple accounts. It does mean that they can run jobs under multiple 
accounts as well, of course! So if you only wanted to change their 
default account - add them to B and make B the default, and you're done. 
Only if you want to prevent them from running jobs within account 'A' in 
the future will you need to remove them from A.


I am pretty sure sacct information for past jobs run within project A 
will not be changed to project B - so the user's job history won't be 
changed by an account move.
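
One way to check this on a given cluster is to ask sacct for the user's past
jobs charged to the old account, before and after the move. The sketch below
only builds the command line; the helper name history_query and the values
someuser, oldaccount and 2022-01-01 are placeholders chosen for illustration,
and whether the records are retained also depends on the site's accounting
setup.

```python
import shlex


def history_query(user, account, since):
    """Build a sacct command listing a user's jobs charged to one account."""
    return [
        "sacct",
        "-X",            # one line per job allocation, no job steps
        "-u", user,      # restrict to this user
        "-A", account,   # restrict to jobs charged to this account
        "-S", since,     # start of the time window
        "--format=JobID,User,Account,State,End",
    ]


if __name__ == "__main__":
    print(shlex.join(history_query("someuser", "oldaccount", "2022-01-01")))
```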


Tina


--
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk



Re: [slurm-users] Changing a user's default account

2022-08-05 Thread Joseph Francisco Guzman
Hi Chip,

You don't have to delete the user, because a user can be in multiple accounts. 
Here's what I'd do:


  1.  add the user to account B
  2.  make account B their default
  3.  remove them from account A

We often swap out user accounts like this.

Best,

Joseph



Joseph F. Guzman - ITS (HPC)

Northern Arizona University

joseph.f.guz...@nau.edu




Re: [slurm-users] Changing a user's default account

2022-08-05 Thread Chip Seraphine
Thanks.  Guess adding/deleting is the way to go then – I was hoping not to lose 
user history, but alas.




Re: [slurm-users] Changing a user's default account

2022-08-05 Thread Renfro, Michael
This should work:

sacctmgr add user someuser account=newaccount # adds user to new account

sacctmgr modify user where user=someuser set defaultaccount=newaccount # change default

sacctmgr remove user where user=someuser and account=oldaccount # remove from old account



[slurm-users] Changing a user's default account

2022-08-05 Thread Chip Seraphine
I have a user U who is in association with account A, and I want to change that 
to account B. The obvious thing does not work:

$ sacctmgr modify user where user="U" set defaultaccount="B"
Can't modify because these users aren't associated with new default account “B”…

OK, fair enough. But I can't find a good way to meet this requirement! 
"sacctmgr create assoc" does not seem to be a thing. Googling around I see a 
lot of wags deleting and recreating the user in this situation, which I 
definitely do _not_ want to do.

How does one change the account that a user is tied to?


--

Chip Seraphine
Linux Admin (Grid)



[slurm-users] SLURM 22.05 and NHC in prolog/epilog

2022-08-05 Thread Bas van der Vlies
We are testing Slurm 22.05 and we noticed a behaviour change for 
prolog/epilog scripts. We use NHC in the prolog/epilog to check if a 
node is healthy. With previous versions (21.08.X and earlier) we had no 
problems.


Now when we do a srun:
 *  srun -t 1 hostname
```
srun: job 3975 queued and waiting for resources
srun: job 3975 has been allocated resources
srun: error: Nodes r16n19 are still not ready
srun: error: Something is wrong with the boot of the nodes.

```


11:57 r16n19:/tmp
root# ps -eaf | grep nhc
root        8 22185  0 Aug03 pts/3    00:00:00 tail -f nhc.log
root    50250 20274  0 11:57 ?        00:00:00 [nhc]
root    50259     1  0 11:57 ?        00:00:00 /bin/bash /usr/sbin/nhc -f FORCE_SETSID=0
root    50268     1  0 11:57 ?        00:00:00 /bin/bash /usr/sbin/nhc -f FORCE_SETSID=0
root    50331 48699  0 11:57 pts/5    00:00:00 grep --color=auto nhc

11:57 r16n19:/tmp
root# ps -eaf | grep 20274
root    20274     1  0 Aug03 ?        00:00:01 /opt/slurm/sw/current/sbin/slurmd -D
root    50250 20274  0 11:57 ?        00:00:00 [nhc]
root    50339 48699  0 11:57 pts/5    00:00:00 grep --color=auto 20274


Have other sites also seen this problem? Did I miss an option?

Regards


--
--
Bas van der Vlies
| High Performance Computing & Visualization | SURF| Science Park 140 | 
1098 XG  Amsterdam

| T +31 (0) 20 800 1300  | bas.vandervl...@surf.nl | www.surf.nl |



Re: [slurm-users] Rolling reboot with at most N machines down simultaneously?

2022-08-05 Thread Corentin Mercier
Hello, 

I think you could use SLURM's power saving mechanism to shut down all your nodes 
simultaneously. 
Then doing srun -N <N> -C <feature> true (or any other small 
work) will wake up N nodes simultaneously. 
You can even run srun while your nodes are powering down; SLURM will reboot them 
as soon as they're powered down. 

I hope this helps! 

Regards, 
C.Mercier 


Re: [slurm-users] Rolling reboot with at most N machines down simultaneously?

2022-08-05 Thread Chris Samuel

On 3/8/22 10:20 pm, Gerhard Strangar wrote:


With a fake license called reboot?


It's a neat idea, but I think there is a catch:

* 3 jobs start, each taking 1 license
* Other reboot jobs are all blocked
* Running reboot jobs trigger node reboot
* Running reboot jobs end when either the script exits and slurmd cleans 
it up before the reboot kills it, or it gets killed as NODE_FAIL when 
the node has been unresponsive for too long and is marked as down

* Licenses for those jobs are released
* 3 more reboot jobs start whilst the original 3 are rebooting
* 6 nodes are now rebooting
* Filesystem fall down go boom
* Also your rebooted nodes are now drained as "Node unexpectedly rebooted"
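
The sequence above can be reproduced with a toy model in Python. It is not a
model of Slurm's scheduler: the function name max_nodes_down and the tick
counts are made up for illustration, and the only assumption taken from the
list is that a reboot job releases its license before its node is back up.

```python
def max_nodes_down(nodes, licenses, job_ticks, reboot_ticks):
    """Simulate license-gated reboot jobs; return the peak number of nodes down.

    Each job holds one license for job_ticks, but its node stays down for
    reboot_ticks, both counted from the tick at which the job starts.
    """
    if licenses < 1:
        raise ValueError("need at least one license")
    pending = nodes      # reboot jobs still waiting for a license
    lic_release = []     # tick at which each running job frees its license
    node_back = []       # tick at which each rebooting node is up again
    peak = 0
    tick = 0
    while pending or node_back:
        # drop licenses and nodes whose time has come
        lic_release = [t for t in lic_release if t > tick]
        node_back = [t for t in node_back if t > tick]
        # start as many waiting reboot jobs as free licenses allow
        while pending and len(lic_release) < licenses:
            pending -= 1
            lic_release.append(tick + job_ticks)
            node_back.append(tick + reboot_ticks)
        peak = max(peak, len(node_back))
        tick += 1
    return peak


if __name__ == "__main__":
    # license freed after 1 tick, node needs 5 ticks to come back
    print(max_nodes_down(6, 3, 1, 5))
    # license held for the whole reboot
    print(max_nodes_down(6, 3, 5, 5))
```

With 3 licenses the first call reports 6 nodes down at once, while the second
call, where the license is held until the node is back, stays at 3.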

I guess you could change your Slurm config to not mark nodes as down if 
they stop responding and make sure the job that's launched, but that 
feels wrong to me.


All the best,
Chris
--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA