Re: [slurm-users] slurm.conf

2024-01-18 Thread Cutts, Tim
Could you not also do this with a single configuration file, by configuring 
multiple clusters that the user can choose between with the -M option?  I 
suppose it depends on the use case: if you want to be able to pick a dev 
cluster over the production one, to test new config options, then the 
environment variable approach makes sense.  If these are actually multiple 
clusters that users are working on in production, then the -M approach might 
work better.
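
As a rough illustration of the -M route, here is a small Python sketch that adds the cluster selection to a Slurm command line. The helper name `with_cluster` and the cluster names "dev" and "prod" are made up for this example; only the -M option itself comes from the Slurm client tools, and it assumes the clusters are already set up for multi-cluster operation.

```python
def with_cluster(argv, cluster):
    """Return a copy of argv with `-M <cluster>` placed after the command name."""
    return [argv[0], "-M", cluster, *argv[1:]]


# "dev" and "prod" are placeholder cluster names, not taken from the thread.
dev_submit = with_cluster(["sbatch", "job.sh"], "dev")
prod_queue = with_cluster(["squeue", "--me"], "prod")
print(dev_submit)
print(prod_queue)
```

The resulting lists can be passed to `subprocess.run` unchanged, so one client configuration serves both clusters.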

Tim

--
Tim Cutts
Scientific Computing Platform Lead
AstraZeneca



On 18/01/2024, 12:07, "slurm-users" wrote:

LEROY Christine 208562 <christine.ler...@cea.fr> writes:

> Is there an env variable in SLURM to tell where the slurm.conf is?
> We would like to have, on the same client node, two types of possible
> submissions addressing two different clusters.

According to man sbatch:

   SLURM_CONF    The location of the Slurm configuration file.

--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo






Re: [slurm-users] slurm.conf

2024-01-18 Thread Hermann Schwärzler

Hi Christine,

Yes, you can set the environment variable SLURM_CONF to the full 
path of the configuration file you want to use and then run any program.


Or you can set it for a single command only, like this:

SLURM_CONF=/your/path/to/slurm.conf sinfo|sbatch|srun|...

But I am not quite sure if this is really the best way to address your 
needs? :-)
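
If the choice has to be made from a script rather than an interactive shell, the same idea can be sketched in Python. This is only an illustration: the helper name `run_with_conf` is invented, and the child process below is a stand-in that echoes the variable so the example runs without Slurm installed. In real use the command would be something like `["sinfo"]`.

```python
import os
import subprocess
import sys


def run_with_conf(conf_path, argv):
    """Run argv with SLURM_CONF set to conf_path, leaving our own environment untouched."""
    env = dict(os.environ, SLURM_CONF=conf_path)
    return subprocess.run(argv, env=env, capture_output=True, text=True)


# Stand-in for a Slurm command: print the variable the child process sees.
child = [sys.executable, "-c", "import os; print(os.environ['SLURM_CONF'])"]
result = run_with_conf("/your/path/to/slurm.conf", child)
print(result.stdout.strip())
```

Because the variable is set only in the child's environment, two submissions aimed at different clusters can run side by side from the same client node.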


Regards,
Hermann

On 1/18/24 10:58, LEROY Christine 208562 wrote:

> Hello all,
>
> Is there an env variable in SLURM to tell where the slurm.conf is?
> We would like to have, on the same client node, two types of possible
> submissions addressing two different clusters.
>
> Thanks in advance,
> Christine




Re: [slurm-users] slurm.conf

2024-01-18 Thread Bjørn-Helge Mevik
LEROY Christine 208562 writes:

> Is there an env variable in SLURM to tell where the slurm.conf is?
> We would like to have, on the same client node, two types of possible
> submissions addressing two different clusters.

According to man sbatch:

   SLURM_CONF    The location of the Slurm configuration file.

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo





Re: [slurm-users] slurm.conf syntax checker?

2021-10-26 Thread Marcus Wagner

Hi Diego,

sorry for the delay.


On 10/18/21 14:20, Diego Zuccato wrote:

> On 15/10/2021 06:02, Marcus Wagner wrote:
>
>> Mostly, our problem was that we forgot to add/remove a node to/from
>> the partitions/topology file, which caused slurmctld to refuse to
>> start. So I wrote a simple checker for that. Here is the output of a
>> sample run:
>
> Even "just" catching syntax errors and the most common errors is
> already a big help, especially for noobs :)
>
>> [OK]: All nodeweights are correct.
>
> What do you mean by this? How can weights be "incorrect"?


We use node weights calculated from several factors, such as CPU 
generation, memory, core count and available generic resources.
For example, some of our nodes have additional NVMe disks. These should 
be scheduled later than the nodes without NVMe, but a job can force 
scheduling onto them by asking for the constraint nvme.
My checker calculates these weights, so I do not have to work them out 
by hand; I just insert the calculated value.

Example output (instead of "[OK]: All nodeweights are correct."):

NodeName=lns[07-08] Sockets=8 CoresPerSocket=18 ThreadsPerCore=1
RealMemory=102 Feature=broadwell,bwx8860,nvme,hostok,hpcwork
Gres=gpu:pascal:1 Weight=111544(was 1) State=UNKNOWN


So the config file sets the weight to "1", but the calculated weight for 
this kind of node is 111544. The checker reports this as 
"Weight=111544(was 1)".
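
The thread does not give the actual weight formula, so the Python sketch below uses an invented one purely to show the shape of such a check: derive a weight from node attributes, compare it with the configured value, and report a mismatch in the "Weight=N(was M)" style. The ranking table, the multipliers and the function names are all hypothetical and will not reproduce the 111544 above.

```python
# Hypothetical ranking: a higher rank means "schedule later".
CPU_RANK = {"broadwell": 1, "skylake": 2}


def expected_weight(node):
    """Combine node attributes into a single weight (invented formula)."""
    weight = CPU_RANK[node["cpu"]] * 10000
    weight += node["cores"] * 10
    # Nodes with NVMe get a slightly higher weight so they are picked later.
    weight += 1 if "nvme" in node["features"] else 0
    return weight


def check_weight(node):
    """Return 'Weight=N', or 'Weight=N(was M)' when the configured value differs."""
    want = expected_weight(node)
    have = node["weight"]
    if want == have:
        return f"Weight={want}"
    return f"Weight={want}(was {have})"


# 8 sockets x 18 cores, configured with the placeholder weight 1.
node = {"cpu": "broadwell", "cores": 144, "features": {"nvme"}, "weight": 1}
print(check_weight(node))
```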


Best
Marcus



>> If someone is interested ...
>
> Surely I am :)




--
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de




Re: [slurm-users] slurm.conf syntax checker?

2021-10-18 Thread Diego Zuccato

On 15/10/2021 06:02, Marcus Wagner wrote:

> Mostly, our problem was that we forgot to add/remove a node to/from the
> partitions/topology file, which caused slurmctld to refuse to start. So I
> wrote a simple checker for that. Here is the output of a sample run:

Even "just" catching syntax errors and the most common errors is already 
a big help, especially for noobs :)

> [OK]: All nodeweights are correct.

What do you mean by this? How can weights be "incorrect"?

> If someone is interested ...

Surely I am :)


--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786



Re: [slurm-users] slurm.conf syntax checker?

2021-10-14 Thread Marcus Wagner

Mostly, our problem was that we forgot to add/remove a node to/from the 
partitions/topology file, which caused slurmctld to refuse to start. So I wrote 
a simple checker for that. Here is the output of a sample run:

reading '../conf/rcc/slurm.conf' ...
reading '../conf/rcc/nodes.conf' ...
reading '../conf/rcc/partitions.conf' ...
reading '../conf/rcc/topology.conf' ...
reading '../conf/rcc/gres.conf' ...

[OK]: All nodeweights are correct.

[OK]: All nodes are defined only once.
[OK]: All nodes are used in partitions.
[OK]: There are no nonexisting nodes in the partitions.

[OK]: No nodes seen more than once in topology file.
[OK]: There are no nodes missing in topology.conf
[OK]: All nodes in topology.conf exist in slurm.conf

WARNING: GRES checking not yet implemented.

If someone is interested ...
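
For readers who want to try something similar, here is a minimal Python sketch of the node/partition part of such a check. It is not the checker described above, whose code is not shown in this thread. It assumes NodeName and PartitionName lines with exact key spelling, supports only simple hostlists with one bracket group per name, and ignores special values such as Nodes=ALL. The "[FAIL]" label and all function names are my own.

```python
import re


def expand_hostlist(expr):
    """Expand a simple hostlist such as 'lns[07-08],gpu01' into host names."""
    hosts = []
    # Split on commas that are not inside [...].
    for part in re.findall(r"[^,\[]+(?:\[[^\]]*\])?[^,]*", expr):
        match = re.fullmatch(r"(.*?)\[([^\]]+)\](.*)", part)
        if match is None:
            hosts.append(part)
            continue
        prefix, ranges, suffix = match.groups()
        for item in ranges.split(","):
            low, _, high = item.partition("-")
            high = high or low
            for number in range(int(low), int(high) + 1):
                # Keep the zero padding used in the range bounds.
                hosts.append(f"{prefix}{str(number).zfill(len(low))}{suffix}")
    return hosts


def parse_conf(text):
    """Return (list of defined nodes, {partition name: nodes})."""
    nodes, partitions = [], {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]
        fields = dict(f.split("=", 1) for f in line.split() if "=" in f)
        if "NodeName" in fields and fields["NodeName"] != "DEFAULT":
            nodes += expand_hostlist(fields["NodeName"])
        elif "PartitionName" in fields and "Nodes" in fields:
            partitions[fields["PartitionName"]] = expand_hostlist(fields["Nodes"])
    return nodes, partitions


def check(text):
    """Return (message, offending nodes) pairs; an empty list means OK."""
    nodes, partitions = parse_conf(text)
    used = {n for members in partitions.values() for n in members}
    duplicates = sorted({n for n in nodes if nodes.count(n) > 1})
    return [
        ("All nodes are defined only once.", duplicates),
        ("All nodes are used in partitions.", sorted(set(nodes) - used)),
        ("There are no nonexisting nodes in the partitions.", sorted(used - set(nodes))),
    ]


# Made-up configuration fragment with one unused and one undefined node.
SAMPLE = """
NodeName=lns[07-08] Sockets=8 CoresPerSocket=18
NodeName=gpu01 Sockets=2
PartitionName=main Nodes=lns[07-08],gpu02
"""

for message, bad in check(SAMPLE):
    print(f"[OK]: {message}" if not bad else f"[FAIL]: {message} {bad}")
```

A topology.conf check would follow the same pattern: expand the node lists and compare the sets.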


Best
Marcus

On 13.10.2021 at 15:36, Paul Edmon wrote:

> Sadly no.  There is a feature request for one though:
> https://bugs.schedmd.com/show_bug.cgi?id=3435
>
> What we've done in the meantime is put together a gitlab runner which basically
> starts up a mini instance of the scheduler and runs slurmctld on the slurm.conf
> we want to put in place.  We then have it reject any changes that cause
> failure.  It's not perfect but it works.  A real syntax checker would be better.
>
> -Paul Edmon-
>
> On 10/12/2021 4:08 PM, bbenede...@goodyear.com wrote:
>
>> Is there any sort of syntax checker that we could run our slurm.conf file
>> through before committing it?  (And sometimes crashing slurmctld in the
>> process...)
>>
>> Thanks!





--
Dipl.-Inf. Marcus Wagner

IT Center
Gruppe: Server, Storage, HPC
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de






Re: [slurm-users] slurm.conf syntax checker?

2021-10-13 Thread Paul Edmon
Sadly no.  There is a feature request for one though: 
https://bugs.schedmd.com/show_bug.cgi?id=3435


What we've done in the meantime is put together a gitlab runner which 
basically starts up a mini instance of the scheduler and runs slurmctld 
on the slurm.conf we want to put in place.  We then have it reject any 
changes that cause failure.  It's not perfect but it works.  A real 
syntax checker would be better.
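
For anyone wanting to build a similar gate, here is a rough Python sketch of the accept/reject logic. It is not the actual runner described above, which is not shown in this thread. It assumes that slurmctld started in the foreground with -D and pointed at the candidate file with -f exits quickly on a fatal configuration error and otherwise keeps running. The grace period and the function names are hypothetical, and the demo uses stand-in processes so it runs without Slurm installed.

```python
import subprocess
import sys


def slurmctld_argv(conf_path):
    """Command line for a foreground slurmctld reading the candidate config."""
    return ["slurmctld", "-D", "-f", conf_path]


def config_starts(argv, grace_seconds=10.0):
    """Start argv; return True if it is still running after the grace period."""
    proc = subprocess.Popen(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Still alive: treat the configuration as accepted and stop the daemon.
        proc.terminate()
        proc.wait()
        return True
    # The daemon exited within the grace period: treat the change as rejected.
    return False


# Stand-ins for slurmctld: one process that dies at once, one that stays up.
dies = [sys.executable, "-c", "import sys; sys.exit(1)"]
stays = [sys.executable, "-c", "import time; time.sleep(30)"]
rejected = not config_starts(dies, grace_seconds=5.0)
accepted = config_starts(stays, grace_seconds=1.0)
print(rejected, accepted)
```

In a CI job the call would be `config_starts(slurmctld_argv(path))`, run inside a throwaway container or test instance, since a daemon that does start will try to act as a real controller.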


-Paul Edmon-

On 10/12/2021 4:08 PM, bbenede...@goodyear.com wrote:

> Is there any sort of syntax checker that we could run our slurm.conf file
> through before committing it?  (And sometimes crashing slurmctld in the
> process...)
>
> Thanks!