[slurm-users] What does the 'Root/Cluster association' level in the Resource Limits document mean?

2022-02-07 Thread taleintervenor
Hi all,

 

According to the Resource Limits page (
https://slurm.schedmd.com/resource_limits.html ), there is a Root/Cluster
association level above the account level that provides default limits. But
how can we check or modify this "cluster association"? Using the command
sacctmgr show association, I can only list all the users' associations.

 

Consider the case where we want to set a default node-count limit for all
users. A command such as sacctmgr modify user set grptres="node=8" can
indeed set the limit on all users at once, but it will overwrite the
existing per-user limits on some specific accounts, so it is not a
satisfying solution. If the "cluster association" exists, it may be exactly
what we want. So how do we set the "cluster association"?



Re: [slurm-users] Slurm 20.02 Dry-Run DB upgrade - Question on the cloned DB

2022-02-07 Thread Moshe Mergy
I don't know much about MySQL, nor about the interactions between Slurm
daemons and MySQL; that's why I wanted to check with the Slurm community
before going on with this dry-run DB upgrade.


Thanks a lot Brian!!

Your help is much appreciated!

Moshe




Re: [slurm-users] Slurm 20.02 Dry-Run DB upgrade - Question on the cloned DB

2022-02-07 Thread Brian Andrus

Moshe,


So it looks like you added the dummy user to the main database somehow.

I would suggest trying again, cautiously, and creating a dummy2 user or
similar. Your questions are now getting out of Slurm and into MySQL
territory, so they may be more appropriate in another forum.



Brian Andrus






Re: [slurm-users] Slurm 20.02 Dry-Run DB upgrade - Question on the cloned DB

2022-02-07 Thread Moshe Mergy
You're right, Brian:

- After shutting down slurmdbd on the cloned-DB node, sshare still shows the
dummy user!!

BUT `mysqlshow -p --status ` shows a different creation date for the
cloned DB.

- 'localhost' is already set for SlurmctldHost and AccountingStorageHost
(slurm.conf) and DbdHost (slurmdbd.conf), in local files on the cloned node.

I had already changed these values after shutting down the slurmd daemon (at
the very beginning) on the Node.

- The original slurm.conf is an NFS-shared file for all the other nodes and
the head machine.

- slurmdbd.conf is a specific local file for the Head machine too.

- The BackupHost parameter is not present in slurm.conf (neither on the Node
nor on the Head).






Re: [slurm-users] Slurm 20.02 Dry-Run DB upgrade - Question on the cloned DB

2022-02-07 Thread Brian Andrus
Odd. I would bet you haven't fully isolated the node that you put the DB
clone on.

If you shut down slurmdbd on the cloned-DB node, does sshare on the head
node still show the dummy user?

A few things you could check:

- You needed to make those slurm.conf and slurmdbd.conf changes after
shutting down slurmd.

- You should not have anything listed as BackupHost for anything on the
cloned-DB node.

- It should only have 'localhost' for SlurmctldHost and
AccountingStorageHost (slurm.conf) and DbdHost (slurmdbd.conf).
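On the isolated clone node, the relevant lines would then look something
like this (a sketch; the file paths assume the default /etc/slurm location):

```shell
# /etc/slurm/slurm.conf on the cloned-DB node
SlurmctldHost=localhost
AccountingStorageHost=localhost

# /etc/slurm/slurmdbd.conf on the cloned-DB node
DbdHost=localhost
StorageHost=localhost
```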



Brian Andrus



Re: [slurm-users] Slurm 20.02 Dry-Run DB upgrade - Question on the cloned DB

2022-02-07 Thread Moshe Mergy
Yes, Brian:

sshare performed on the head node, or on ANY other node, shows the dummy
user created on the node with the cloned DB.

I wanted to test the (cloned) DB upgrade in advance, and do the real DB
upgrade a few weeks later (after the users' deadlines).

Moshe






Re: [slurm-users] [External] What is an easy way to prevent users from running programs on the master/login node?

2022-02-07 Thread Michael Robbert
They moved Arbiter2 to Github. Here is the new official repo: 
https://github.com/CHPC-UofU/arbiter2

Mike




Re: [slurm-users] Slurm 20.02 Dry-Run DB upgrade - Question on the cloned DB

2022-02-07 Thread Brian Andrus

So if you run sshare on the head node, it shows your dummy user?

At any rate, just do a DB dump (also known as a backup) and you can restore
it if you have an issue of any sort.
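A dump-and-restore cycle can be sketched like this (assuming the default
accounting database name slurm_acct_db; adjust the user and credentials to
your site):

```shell
# Dump the Slurm accounting database to a file (do this before any upgrade)
mysqldump -u root -p --single-transaction slurm_acct_db > slurm_acct_db.sql

# If something goes wrong, drop the damaged database and restore the dump
mysql -u root -p -e "DROP DATABASE IF EXISTS slurm_acct_db; CREATE DATABASE slurm_acct_db"
mysql -u root -p slurm_acct_db < slurm_acct_db.sql
```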



Brian Andrus





Re: [slurm-users] [External] What is an easy way to prevent users from running programs on the master/login node?

2022-02-07 Thread Stefan Staeglich
Hi,

I've just noticed that the repository https://gitlab.chpc.utah.edu/arbiter2
seems to be down. Does someone know more?

Thank you!

Best,
Stefan

On Tuesday, 27 April 2021 at 17:35:35 CET, Prentice Bisbal wrote:
> I think someone asked this same exact question a few weeks ago. The best
> solution I know of is to use Arbiter, which was created exactly for this
> situation. It uses cgroups to limit resource usage, but it adjusts those
> limits based on login-node utilization and each user's behavior ("bad"
> users get their resources limited more severely when they do "bad" things).
> 
> I will be deploying it myself very soon.
> 
> https://dylngg.github.io/resources/arbiterTechPaper.pdf
> 
> 
> Prentice
> 
> On 4/23/21 10:37 PM, Cristóbal Navarro wrote:
> > Hi Community,
> > I have a set of users who are still not so familiar with Slurm, and
> > yesterday they bypassed srun/sbatch and just ran their CPU program
> > directly on the head/login node, thinking it would still run on the
> > compute node. I am aware that I will need to teach them some basic
> > usage, but in the meanwhile, how have you solved this type of
> > user-behavior problem? Is there a preferred way to restrict the
> > master/login resources, or actions, to the regular users?
> > 
> > many thanks in advance
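As a simpler stopgap until something like Arbiter is deployed, per-user
resource caps on the login node can also be set directly through systemd
slices (a sketch; requires systemd >= 239 for templated slice drop-ins, and
the CPU/memory values are only examples):

```shell
# /etc/systemd/system/user-.slice.d/50-login-limits.conf
# Applies to every user's session slice on this machine
[Slice]
CPUQuota=100%
MemoryMax=8G
TasksMax=512
```

After adding the file, run "systemctl daemon-reload" so that new sessions
pick up the limits.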


-- 
Stefan Stäglich,  Universität Freiburg,  Institut für Informatik
Georges-Köhler-Allee,  Geb.52,  79110 Freiburg,  Germany

E-Mail : staeg...@informatik.uni-freiburg.de
WWW: gki.informatik.uni-freiburg.de
Phone: +49 761 203-8223
Fax: +49 761 203-8222




[slurm-users] Slurm 20.02 Dry-Run DB upgrade - Question on the cloned DB

2022-02-07 Thread Moshe Mergy
Hi all


I cloned the Slurm DB onto a separate node, in order to test and run a
dry-run DB upgrade:

Slurm 20.02.4 --> 21.08.5 on CentOS 7.6 / MariaDB 5.5.64-1,

following the guide
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#make-a-dry-run-database-upgrade


DB cloning: OK - mysql shows 2 different DBs with the same data:

the original DB on the Head machine + the cloned DB on the Node.


MY FEAR: destroying the original DB.


MY TEST: create a dummy user in the cloned DB and check that it is not
present in the original DB.


Node: only slurmdbd is running (NO slurmd)

Head: slurmdbd and slurmctld are running


1/ On Node, "sacctmgr create user dummy" in the cloned DB

2/ On Node, "sacctmgr show user -s": user dummy is present in the cloned DB
--> OK

3/ On Head, "sacctmgr show user -s": user dummy is NOT present in the
original DB --> OK


PROBLEM:

But if I use the command "sshare -al" to list the users, the dummy user
appears to be present in the cloned DB... AND in the original DB too...!!


QUESTIONS:

It seems (from the manual) that sshare queries slurmctld, which is connected
to the original DB on the Head machine.

So the dummy user should NOT be listed by the "sshare" command from the Head
machine. Nor from the Node...?!

(And on the other hand, if sshare does not use slurmctld but sends requests
directly to the DB, then the dummy user should be listed (with sshare) only
on the Node, but NOT on the Head machine...)


Are the 2 DBs really separate?

Can I run the DB upgrade test on this cloned DB without destroying the
original DB?
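One way to see which daemons are actually being queried (a sketch; assuming
the config files live in /etc/slurm) is to compare, on both machines, the
accounting host that slurmctld reports with the one the local slurmdbd is
configured for:

```shell
# Which slurmdbd does this node's slurmctld use?
scontrol show config | grep -i AccountingStorageHost

# Which MySQL/MariaDB host does the local slurmdbd use?
grep -Ei 'DbdHost|StorageHost' /etc/slurm/slurmdbd.conf
```

If both nodes end up pointing at the same slurmdbd (or the same
StorageHost), sshare will show the same data everywhere, which would explain
the dummy user appearing on both.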


Thanks a lot for your help!

Regards,


MosheM

(Slurm & MySQL novice)