[slurm-users] Re: [ext] API - Specify GPUs

2024-07-26 Thread Hagdorn, Magnus Karl Moritz via slurm-users
On Fr, 2024-07-26 at 19:34 +, jpuerto--- via slurm-users wrote:
> It does not seem that the REST API allows for folks to configure
> their jobs to utilize GPUs, using the traditional methods. IE, there
> does not appear to be an equivalent between the --gpus (or --gres)
> flag on sbatch/srun and the REST API's job submission endpoint. Can
> anyone point me towards what should be used if the version we are on
> does not support tres specifications?

I think the API only supports the TRES specification; that is certainly
how I got it to work.
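
For anyone hitting the same thing, here is a sketch of a GPU request via
the TRES fields of the job submission endpoint. It is only illustrative:
the endpoint version, the partition name and the exact payload layout
(e.g. whether "script" sits next to or inside "job", and how
"environment" is expressed) vary between slurmrestd OpenAPI versions, so
check the spec for your release.

curl -s -X POST "http://localhost:6820/slurm/v0.0.39/job/submit" \
     -H "X-SLURM-USER-NAME: $USER" \
     -H "X-SLURM-USER-TOKEN: $SLURM_JWT" \
     -H "Content-Type: application/json" \
     -d '{
           "script": "#!/bin/bash\nnvidia-smi",
           "job": {
             "name": "gpu-test",
             "partition": "gpu",
             "tres_per_node": "gres/gpu:1",
             "current_working_directory": "/tmp",
             "environment": ["PATH=/bin:/usr/bin"]
           }
         }'

As far as I can tell, tres_per_job, tres_per_socket and tres_per_task
exist alongside tres_per_node and correspond to the various --gpus/--gres
style flags.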





[slurm-users] Re: [ext] scrontab question

2024-05-07 Thread Hagdorn, Magnus Karl Moritz via slurm-users
Hm, strange. I don't see a problem with the time specs, although I
would use
*/5 * * * *
to run something every 5 minutes. In my scrontab I also specify a
partition, etc. But I don't think that is necessary.
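A minimal entry of the kind I have in mind looks like this (the #SCRON
lines carry the usual sbatch options; the partition and time limit here
are just examples):

#SCRON -p standard
#SCRON -t 00:10:00
*/5 * * * * /directory/subdirectory/crontest.sh
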
regards
magnus

On Di, 2024-05-07 at 12:06 -0500, Sandor via slurm-users wrote:
> I am working out the details of scrontab. My initial testing has
> raised a question I cannot resolve.
> Within the scrontab editor I have the following example from the Slurm
> documentation:
> 
> 0,5,10,15,20,25,30,35,40,45,50,55 * * * *
> /directory/subdirectory/crontest.sh
> 
> When I save it, scrontab marks the line with #BAD:. I do not
> understand why; the only difference from the documentation example is
> the directory path.
> 
> Is there an underlying assumption that traditional Linux crontab is
> available to the general user?
> 






[slurm-users] Re: [ext] Re: canonical way to run longer shell/bash interactive job (instead of srun inside of screen/tmux at front-end)?

2024-02-28 Thread Hagdorn, Magnus Karl Moritz via slurm-users
On Tue, 2024-02-27 at 08:21 -0800, Brian Andrus via slurm-users wrote:
> For us, we put a load balancer in front of the login nodes with
> session affinity enabled. This makes users land on the same backend
> node each time.
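
For reference, one common way to implement such session affinity, purely
as a sketch and not necessarily the setup described above, is source-IP
stickiness on the proxy; with HAProxy that would look roughly like this
(backend names and addresses are made up):

frontend ssh_in
    bind *:22
    mode tcp
    default_backend login_nodes

backend login_nodes
    mode tcp
    balance source    # hash the client IP, so a client keeps hitting the same login node
    server login1 10.0.0.11:22 check
    server login2 10.0.0.12:22 check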

Hi Brian,
that sounds interesting - how did you implement session affinity?
cheers
magnus

-- 
Magnus Hagdorn
Charité – Universitätsmedizin Berlin
Geschäftsbereich IT | Scientific Computing
 
Campus Charité Mitte
BALTIC - Invalidenstraße 120/121
10115 Berlin
 
magnus.hagd...@charite.de
https://www.charite.de
HPC Helpdesk: sc-hpc-helpd...@charite.de





[slurm-users] Re: [ext] Restricting local disk storage of jobs

2024-02-06 Thread Hagdorn, Magnus Karl Moritz via slurm-users
Hi Tim,
in the end the InitScript didn't contain anything useful, because

slurmd: error: _parse_next_key: Parsing error at unrecognized key:
InitScript

At that stage I gave up. This was with Slurm 23.02. My plan was to set
up the local scratch directory on XFS and then have the script apply a
project quota, i.e. a quota attached to the directory.
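
For what it's worth, the idea was roughly the following (untested, since
the option was never accepted here; the paths, the limit and the use of
the job ID as the XFS project ID are all assumptions):

#!/bin/bash
# Sketch of a job_container.conf InitScript: put an XFS project quota on
# the per-job scratch directory so a job cannot write more than it asked for.
SCRATCH_MNT=/local/scratch                 # XFS filesystem mounted with prjquota
JOB_DIR="$SCRATCH_MNT/$SLURM_JOB_ID"       # directory the tmpfs plugin binds to /tmp
LIMIT=50g                                  # would be derived from the job request in practice

# tag the directory with a project ID (reusing the numeric job ID) ...
xfs_quota -x -c "project -s -p $JOB_DIR $SLURM_JOB_ID" "$SCRATCH_MNT"
# ... and set a hard block limit for that project
xfs_quota -x -c "limit -p bhard=$LIMIT $SLURM_JOB_ID" "$SCRATCH_MNT"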

I would start by checking whether Slurm recognises the InitScript option.

Regards
magnus

On Tue, 2024-02-06 at 15:24 +0100, Tim Schneider wrote:
> Hi Magnus,
> 
> thanks for your reply! If you can, would you mind sharing the
> InitScript 
> of your attempt at getting it to work?
> 
> Best,
> 
> Tim
> 
> On 06.02.24 15:19, Hagdorn, Magnus Karl Moritz wrote:
> > Hi Tim,
> > we are using the job_container/tmpfs plugin to map /tmp to a local
> > NVMe drive, which works great. I did consider setting up directory
> > quotas; I thought the InitScript [1] option should do the trick.
> > Alas, I didn't get it to work. If I remember correctly, Slurm
> > complained about the option being present. In the end we recommend
> > that our users request exclusive use of a node if they are going to
> > use a lot of local scratch space. I don't think this happens very
> > often, if at all.
> > Regards
> > magnus
> > 
> > [1]
> > https://slurm.schedmd.com/job_container.conf.html#OPT_InitScript
> > 
> > 
> > On Tue, 2024-02-06 at 14:39 +0100, Tim Schneider via slurm-users
> > wrote:
> > > Hi,
> > > 
> > > In our Slurm cluster, we are using the job_container/tmpfs plugin
> > > to ensure that each user can use /tmp and that it gets cleaned up
> > > after them. Currently, we are mapping /tmp into the node's RAM,
> > > which means that the cgroups make sure that users can only use a
> > > certain amount of storage inside /tmp.
> > > 
> > > Now we would like to use the node's local SSD instead of its RAM
> > > to hold the files in /tmp. I have seen people define local storage
> > > as a GRES, but I am wondering how to make sure that users do not
> > > exceed the storage space they requested in a job. Does anyone have
> > > an idea how to configure local storage as a properly tracked
> > > resource?
> > > 
> > > Thanks a lot in advance!
> > > 
> > > Best,
> > > 
> > > Tim
> > > 
> > > 

-- 
Magnus Hagdorn
Charité – Universitätsmedizin Berlin
Geschäftsbereich IT | Scientific Computing
 
Campus Charité Mitte
BALTIC - Invalidenstraße 120/121
10115 Berlin
 
magnus.hagd...@charite.de
https://www.charite.de
HPC Helpdesk: sc-hpc-helpd...@charite.de





[slurm-users] Re: [ext] Restricting local disk storage of jobs

2024-02-06 Thread Hagdorn, Magnus Karl Moritz via slurm-users
Hi Tim,
we are using the job_container/tmpfs plugin to map /tmp to a local NVMe
drive, which works great. I did consider setting up directory quotas; I
thought the InitScript [1] option should do the trick. Alas, I didn't
get it to work. If I remember correctly, Slurm complained about the
option being present. In the end we recommend that our users request
exclusive use of a node if they are going to use a lot of local scratch
space. I don't think this happens very often, if at all.
Regards
magnus

[1] 
https://slurm.schedmd.com/job_container.conf.html#OPT_InitScript
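
For reference, the configuration itself is small; a minimal sketch (the
BasePath is just an example, pointing at the NVMe mount) looks something
like this:

# slurm.conf
JobContainerType=job_container/tmpfs

# job_container.conf
AutoBasePath=true
BasePath=/local/scratch

With that in place each job gets a private /tmp under BasePath that is
removed when the job ends.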


On Tue, 2024-02-06 at 14:39 +0100, Tim Schneider via slurm-users wrote:
> Hi,
> 
> In our Slurm cluster, we are using the job_container/tmpfs plugin to
> ensure that each user can use /tmp and that it gets cleaned up after
> them. Currently, we are mapping /tmp into the node's RAM, which means
> that the cgroups make sure that users can only use a certain amount of
> storage inside /tmp.
> 
> Now we would like to use the node's local SSD instead of its RAM to
> hold the files in /tmp. I have seen people define local storage as a
> GRES, but I am wondering how to make sure that users do not exceed the
> storage space they requested in a job. Does anyone have an idea how to
> configure local storage as a properly tracked resource?
> 
> Thanks a lot in advance!
> 
> Best,
> 
> Tim
> 
> 

-- 
Magnus Hagdorn
Charité – Universitätsmedizin Berlin
Geschäftsbereich IT | Scientific Computing
 
Campus Charité Mitte
BALTIC - Invalidenstraße 120/121
10115 Berlin
 
magnus.hagd...@charite.de
https://www.charite.de
HPC Helpdesk: sc-hpc-helpd...@charite.de


