[slurm-users] Re: SLURM configuration for LDAP users

2024-02-05 Thread Loris Bennett via slurm-users
Hi Richard, Richard Chang via slurm-users writes: > Job submission works for local users. I was not aware we need to manually add the LDAP users to the SlurmDB. Does it mean we need to add each and every user in LDAP to the Slurm database? We add users to the Slurm DB automatically with
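
For reference, since the preview above is cut off: the standard tool for populating the Slurm accounting database is sacctmgr. A minimal sketch, assuming a plain account/user layout (the account and user names are placeholders, not taken from the thread):

    # create an account once, then add an LDAP user to it
    sacctmgr -i add account physics Description="physics group" Organization=physics
    sacctmgr -i add user alice DefaultAccount=physics

Sites typically wrap commands like these in a script driven by an LDAP query, which is presumably what "automatically" refers to above.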

[slurm-users] Re: Starting a job after a file is created in previous job (dependency looking for solution)

2024-02-06 Thread Loris Bennett via slurm-users
Hi Amjad, Amjad Syed via slurm-users writes: > Hello > I have the following scenario: I need to submit a sequence of up to 400 jobs where the even jobs depend on the preceding odd job to finish and every odd job depends on the presence of a file generated by the preceding even job (a
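
A minimal sketch of the chaining part, assuming plain afterok dependencies (script names and the file path are placeholders, not taken from the thread):

    # submit the chain so each job starts only after the previous one
    # finished successfully
    prev=$(sbatch --parsable job_001.sh)
    for i in $(seq 2 400); do
        prev=$(sbatch --parsable --dependency=afterok:${prev} "$(printf 'job_%03d.sh' "$i")")
    done

    # inside an odd-numbered job script, wait for the file the preceding
    # even job should have produced
    while [ ! -e /scratch/step_output ]; do sleep 30; done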

[slurm-users] job_submit.lua - uid in Docker cluster

2024-02-14 Thread Loris Bennett via slurm-users
Hi, Having used https://github.com/giovtorres/slurm-docker-cluster successfully a couple of years ago to develop a job_submit.lua plugin, I am trying to do this again. However, the plugin which works on our current cluster (CentOS 7.9, Slurm 23.02.7) fails in the Docker cluster (Rocky 8.9, S

[slurm-users] Re: Suggestions for Partition/QoS configuration

2024-04-04 Thread Loris Bennett via slurm-users
Hi Thomas, "thomas.hartmann--- via slurm-users" writes: > Hi, we're testing possible slurm configurations on a test system right now. Eventually, it is going to serve ~1000 users. We're going to have some users who are going to run lots of short jobs (a couple of minutes to ~4h) and s
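
One possible direction, purely as a sketch and not the configuration the thread arrived at: separate QoS levels for short and long jobs, with limits that stop long jobs from filling the whole cluster (all values are placeholders):

    sacctmgr -i add qos short MaxWall=04:00:00 Priority=100
    sacctmgr -i add qos long MaxWall=14-00:00:00 Priority=10 GrpTRES=cpu=512

These can then be attached to partitions via the partition's QOS= parameter in slurm.conf.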

[slurm-users] Re: Avoiding fragmentation

2024-04-08 Thread Loris Bennett via slurm-users
Hi Gerhard, Gerhard Strangar via slurm-users writes: > Hi, I'm trying to figure out how to deal with a mix of few- and many-cpu jobs. By that I mean most jobs use 128 cpus, but sometimes there are jobs with only 16. As soon as that job with only 16 is running, the scheduler splits the
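
One commonly suggested way to limit this kind of fragmentation, offered here only as a sketch (the thread's outcome is not shown above), is to confine the small jobs to a subset of nodes so that the 128-core jobs keep whole nodes available:

    # illustrative slurm.conf fragment; node names and counts are placeholders
    PartitionName=small Nodes=node[01-04] MaxTime=3-00:00:00 Default=NO
    PartitionName=big Nodes=node[05-64] MaxTime=3-00:00:00 Default=YES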

[slurm-users] Re: scheduling according time requirements

2024-04-30 Thread Loris Bennett via slurm-users
Hi Dietmar, Dietmar Rieder via slurm-users writes: > Hi, is it possible to have slurm scheduling jobs automatically according to the "-t" time requirements to a fitting partition? e.g. 3 partitions: PartitionName=standard Nodes=c-[01-10] Default=YES MaxTime=04:00:00 DefaultTime=00:1
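
One way to get this effect without a job_submit plugin, given here only as a sketch (the partition names beyond "standard" are placeholders): list every partition at submission time and set EnforcePartLimits=ANY in slurm.conf, so a job is accepted as long as its --time fits at least one of the listed partitions and runs in one whose MaxTime it satisfies:

    sbatch --partition=standard,medium,long --time=12:00:00 job.sh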

[slurm-users] Re: [EXTERN] Re: scheduling according time requirements

2024-04-30 Thread Loris Bennett via slurm-users
Hi Dietmar, Dietmar Rieder via slurm-users writes: > Hi Loris, > On 4/30/24 2:53 PM, Loris Bennett via slurm-users wrote: >> Hi Dietmar, >> Dietmar Rieder via slurm-users writes: >>> Hi, >>> is it possible to have slur

[slurm-users] Re: [EXTERN] Re: scheduling according time requirements

2024-04-30 Thread Loris Bennett via slurm-users
Hi Dietmar, Dietmar Rieder via slurm-users writes: > Hi Loris, > On 4/30/24 3:43 PM, Loris Bennett via slurm-users wrote: >> Hi Dietmar, >> Dietmar Rieder via slurm-users writes: >>> Hi Loris, >>> On 4/30/24 2

[slurm-users] Re: GPU GRES verification and some really broad questions.

2024-05-10 Thread Loris Bennett via slurm-users
Hi, Shooktija S N via slurm-users writes: > Hi, I am a complete slurm-admin and sys-admin noob trying to set up a 3 node Slurm cluster. I have managed to get a minimum working example running, in which I am able to use a GPU (NVIDIA GeForce RTX 4070 ti) as a GRES. This is slurm.co
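
For comparison, a minimal single-GPU GRES setup looks roughly like this (node name, CPU/memory sizes and the device path are placeholders, not the poster's values):

    # gres.conf on the compute node
    NodeName=node01 Name=gpu Type=rtx4070ti File=/dev/nvidia0

    # slurm.conf
    GresTypes=gpu
    NodeName=node01 Gres=gpu:rtx4070ti:1 CPUs=16 RealMemory=64000 State=UNKNOWN

    # quick functional check
    srun --gres=gpu:1 nvidia-smi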

[slurm-users] Re: diagnosing why interactive/non-interactive job waits are so long with State=MIXED

2024-06-05 Thread Loris Bennett via slurm-users
Ryan Novosielski via slurm-users writes: > We do have bf_continue set. And also bf_max_job_user=50, because we discovered that one user can submit so many jobs that it will hit the limit of the number it’s going to consider and not run some jobs that it could otherwise run. > On Jun 4
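
Both options mentioned above live in SchedulerParameters; an illustrative slurm.conf line (only bf_continue and bf_max_job_user=50 come from the message, the other values are placeholders):

    SchedulerParameters=bf_continue,bf_max_job_user=50,bf_window=10080,bf_max_job_test=1000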

[slurm-users] Re: sbatch: Node count specification invalid - when only specifying --ntasks

2024-06-10 Thread Loris Bennett via slurm-users
Hi George, George Leaver via slurm-users writes: > Hello, Previously we were running 22.05.10 and could submit a "multinode" job using only the total number of cores to run, not the number of nodes. For example, in a cluster containing only 40-core nodes (no hyperthreading), Slurm woul
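
The style of submission being described, as a sketch (partition name and script are placeholders): the total task count only, with no explicit node count, on a cluster of 40-core nodes:

    sbatch --ntasks=80 --partition=multinode job.sh   # 80 tasks, no --nodes given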

[slurm-users] Re: sbatch: Node count specification invalid - when only specifying --ntasks

2024-06-11 Thread Loris Bennett via slurm-users
Hi George, George Leaver via slurm-users writes: > Hi Loris, >> Doesn't splitting up your jobs over two partitions mean that either one of the two partitions could be full, while the other has idle nodes? > Yes, potentially, and we may move away from our current config at some point

[slurm-users] Re: How to exclude master from computing? Set to DRAINED?

2024-06-24 Thread Loris Bennett via slurm-users
Hi Xaver, Xaver Stiensmeier via slurm-users writes: > Dear Slurm users, in our project we exclude the master from computing before starting Slurmctld. We used to exclude the master from computing by simply not mentioning it in the configuration, i.e. just not having: Partition
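
The "set it to drained" variant from the subject line, as a sketch (hostname and reason text are placeholders):

    scontrol update NodeName=master State=DRAIN Reason="head node - no compute jobs"

The alternative described above is simply to leave the head node out of every partition's Nodes= list.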

[slurm-users] Re: Unable to run sequential jobs simultaneously on the same node

2024-08-18 Thread Loris Bennett via slurm-users
Dear Arko, Arko Roy via slurm-users writes: > I want to run 50 sequential jobs (essentially 50 copies of the same code with different input parameters) on a particular node. However, as soon as one of the jobs gets executed, the other 49 jobs get killed immediately with exit code 9.
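
For 50 independent copies of the same code with different inputs, the usual pattern is a job array; a sketch only, not the poster's script (program name, input naming, and resource values are placeholders):

    #!/bin/bash
    #SBATCH --array=1-50
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=1
    #SBATCH --mem=2G
    #SBATCH --time=02:00:00

    ./my_code input_${SLURM_ARRAY_TASK_ID}.dat

Sized like this, several array tasks can run on the same node at once, provided the node's resources and the cluster's limits allow it.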

[slurm-users] Re: Unable to run sequential jobs simultaneously on the same node

2024-08-19 Thread Loris Bennett via slurm-users
Dear Arko, Arko Roy writes: > Thanks Loris and Gareth. Here is the job submission script. If you find any errors please let me know. Since I am not the admin but just a user, I think I don't have access to the prolog and epilogue files. > If the jobs are independent, why do you want to

[slurm-users] salloc not starting shell despite LaunchParameters=use_interactive_step

2024-09-05 Thread Loris Bennett via slurm-users
Hi, With

    $ salloc --version
    slurm 23.11.10

and

    $ grep LaunchParameters /etc/slurm/slurm.conf
    LaunchParameters=use_interactive_step

the following

    $ salloc --partition=interactive --ntasks=1 --time=00:03:00 --mem=1000 --qos=standard
    salloc: Granted job allocation 18928869
    sal
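
A quick check worth doing in this situation, as a sketch: confirm that the running slurmctld actually has the parameter, rather than only the copy of slurm.conf on disk:

    scontrol show config | grep -Ei 'LaunchParameters|InteractiveStepOptions'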

[slurm-users] Re: salloc not starting shell despite LaunchParameters=use_interactive_step

2024-09-05 Thread Loris Bennett via slurm-users
02068@lt10000 ~]$ > Best Regards, > Carsten > On 05.09.24 at 14:17, Loris Bennett via slurm-users wrote: > > Hi, > > With > > $ salloc --version > > slurm 23.11.10 > > and

[slurm-users] Re: salloc not starting shell despite LaunchParameters=use_interactive_step

2024-09-06 Thread Loris Bennett via slurm-users
e data points. Cheers, Loris > -Paul Edmon- > On 9/5/24 10:22 AM, Loris Bennett via slurm-users wrote: >> Jason Simms via slurm-users writes: >>> Ours works fine, however, without the InteractiveStepOptions parameter. >> My assumption is also that default v
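
For reference, the parameter being discussed, shown with what the slurm.conf documentation gives as its default (assuming a recent Slurm version; verify against the local man page):

    LaunchParameters=use_interactive_step
    InteractiveStepOptions="--interactive --preserve-env --pty $SHELL"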