Re: [slurm-users] [External] What is an easy way to prevent users from running programs on the master/login node.

2021-06-11 Thread Juergen Salk
Hi,

I can't speak specifically for Arbiter, but to the best of my knowledge this
is just how cgroup memory limits work in general: both anonymous memory and
page cache always count against the cgroup memory limit.

This also applies to the memory constraints imposed on compute jobs when
ConstrainRAMSpace=yes is set in cgroup.conf.
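
For illustration, a minimal cgroup.conf along those lines could look as
follows (everything apart from ConstrainRAMSpace is only an example, not a
recommendation for your site), together with TaskPlugin=task/cgroup in
slurm.conf:

  CgroupAutomount=yes
  ConstrainCores=yes
  ConstrainRAMSpace=yes
  ConstrainSwapSpace=yes
  AllowedSwapSpace=0

The kernel then charges both a job's anonymous memory and its page cache
against that cgroup limit; pages that are only cache can usually be reclaimed
under memory pressure rather than immediately triggering the OOM killer.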

Best regards
Jürgen


* Stefan Staeglich  [210611 14:01]:
> Hi Prentice,
> 
> thanks for the hint. I'm evaluating this too.
> 
> It seems that Arbiter doesn't distinguish between RAM that's actually used
> and RAM that's used only as cache. Or is my impression wrong?
> 
> Best,
> Stefan
> 
> Am Dienstag, 27. April 2021, 17:35:35 CEST schrieb Prentice Bisbal:
> > I think someone asked this same exact question a few weeks ago. The best
> > solution I know of is to use Arbiter, which was created exactly for this
> > situation. It uses cgroups to limit resource usage, but it adjusts those
> > limits based on login node utilization and each user's behavior ("bad"
> > users get their resources limited more severely when they do "bad" things).
> > 
> > I will be deploying it myself very soon.
> > 
> > https://dylngg.github.io/resources/arbiterTechPaper.pdf
> > 
> > 
> > Prentice
> > 
> > On 4/23/21 10:37 PM, Cristóbal Navarro wrote:
> > > Hi Community,
> > > I have a set of users who are still not very familiar with Slurm, and
> > > yesterday they bypassed srun/sbatch and ran their CPU program directly
> > > on the head/login node, thinking it would still run on a compute node.
> > > I am aware that I will need to teach them some basic usage, but in the
> > > meantime, how have you solved this type of user-behavior problem? Is
> > > there a preferred way to restrict the master/login node's resources, or
> > > actions, to regular users?
> > > 
> > > many thanks in advance
> 
> 
> -- 
> Stefan Stäglich,  Universität Freiburg,  Institut für Informatik
> Georges-Köhler-Allee,  Geb.52,   79110 Freiburg,Germany
> 
> E-Mail : staeg...@informatik.uni-freiburg.de
> WWW: gki.informatik.uni-freiburg.de
> Telefon: +49 761 203-8223
> Fax: +49 761 203-8222
> 



Re: [slurm-users] [EXT] Re: Slurm Scheduler Help

2021-06-11 Thread Dana, Jason T.
Thank you for the response!

I have given those parameters a shot and will monitor the queue.

These parameters would really only affect backfill with respect to job time
limits, correct? Based on what I have read, I was under the impression that the
main scheduler and the backfill scheduler treat partitions independently,
meaning that if I have a large number of jobs queued for a single partition,
another partition with no jobs running would still be seen as free by the
scheduler, and jobs submitted to it should not be held up because of the filled
partition. Do you know if I’m mistaken?
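
In case it is useful for the discussion, these are fairly standard commands I
can use to watch the pending reasons and the association limits while I
monitor (the user name below is just a placeholder):

  squeue -t PD -o "%.12i %.12P %.10u %.6t %.20r"
  sacctmgr show assoc where user=<username> format=Cluster,Account,User,Partition,GrpTRES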

Jason Dana
JHUAPL
REDD/RA2
Senior Systems Administrator/Software Engineer
jason.d...@jhuapl.edu
240-564-1045 (w)

Need Support from REDD?  You can enter a ticket using the new REDD Help Desk 
Portal (https://help.rcs.jhuapl.edu/servicedesk) if you have an active account 
or e-mail redd-h...@outermail.jhuapl.edu.

From: slurm-users  on behalf of "Renfro, 
Michael" 
Reply-To: Slurm User Community List 
Date: Friday, June 11, 2021 at 2:16 PM
To: Slurm User Community List , 
"slurm-us...@schedmd.com" 
Subject: [EXT] Re: [slurm-users] Slurm Scheduler Help

Not sure it would work out to 60k queued jobs, but we're using:

SchedulerParameters=bf_window=43200,bf_resolution=2160,bf_max_job_user=80,bf_continue,default_queue_depth=200

in our setup. bf_window is driven by our 30-day max job time, bf_resolution is 
at 5% of that time, and the other values are just what we landed on. This did 
manage to address some backfill issues we had in previous years.

From: slurm-users  on behalf of Dana, 
Jason T. 
Date: Friday, June 11, 2021 at 12:27 PM
To: slurm-us...@schedmd.com 
Subject: [slurm-users] Slurm Scheduler Help

Hello,

I currently manage a small cluster separated into 4 partitions. I am
experiencing unexpected behavior from the scheduler when the queue has been
flooded with a large number of jobs (around 60k) by a single user to a single
partition. We have each user bound to a global GrpTRES CPU limit. Once this
user reaches their CPU limit, the jobs are queued with reason
“AssocGroupCpuLimit”, but after a few hundred or so jobs the reason seems to
switch to “Priority”. The issue is that once this switch occurs, it appears to
affect all other partitions as well. Currently, if any job is submitted to any
of the partitions, regardless of the resources available, it is queued by the
scheduler with the reason “Priority”. We had the scheduler initially
configured for backfill but have also tried switching to builtin, and it did
not seem to make a difference. I tried increasing the default_queue_depth to 10
and it didn’t seem to help. The scheduler log is also unhelpful, as it simply
lists the accounting-limited jobs and never mentions the “Priority” queued jobs:

sched: [2021-06-11T13:21:53.993] JobId=495780 delayed for accounting policy
sched: [2021-06-11T13:21:53.997] JobId=495781 delayed for accounting policy
sched: [2021-06-11T13:21:54.001] JobId=495782 delayed for accounting policy
sched: [2021-06-11T13:21:54.005] JobId=495783 delayed for accounting policy
sched: [2021-06-11T13:21:54.005] loop taking too long, breaking out

I’ve gone through all the documentation I’ve found on the scheduler and cannot 
seem to resolve this. I’m hoping I’m simply missing something.

Any help would be great. Thank you!

Jason



Re: [slurm-users] Slurm Scheduler Help

2021-06-11 Thread Renfro, Michael
Not sure it would work out to 60k queued jobs, but we're using:

SchedulerParameters=bf_window=43200,bf_resolution=2160,bf_max_job_user=80,bf_continue,default_queue_depth=200

in our setup. bf_window is driven by our 30-day max job time, bf_resolution is 
at 5% of that time, and the other values are just what we landed on. This did 
manage to address some backfill issues we had in previous years.
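
In case the arithmetic is useful to anyone adapting this: bf_window is
expressed in minutes, so a 30-day maximum job time works out to
30 * 24 * 60 = 43200, and 2160 is simply 5% of that figure. A rough sketch of
where these settings sit in slurm.conf (the partition line is only
illustrative, not our actual configuration):

  SchedulerType=sched/backfill
  SchedulerParameters=bf_window=43200,bf_resolution=2160,bf_max_job_user=80,bf_continue,default_queue_depth=200
  PartitionName=batch Nodes=node[001-040] MaxTime=30-00:00:00 State=UP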

From: slurm-users  on behalf of Dana, 
Jason T. 
Date: Friday, June 11, 2021 at 12:27 PM
To: slurm-us...@schedmd.com 
Subject: [slurm-users] Slurm Scheduler Help

Hello,

I currently manage a small cluster separated into 4 partitions. I am
experiencing unexpected behavior from the scheduler when the queue has been
flooded with a large number of jobs (around 60k) by a single user to a single
partition. We have each user bound to a global GrpTRES CPU limit. Once this
user reaches their CPU limit, the jobs are queued with reason
“AssocGroupCpuLimit”, but after a few hundred or so jobs the reason seems to
switch to “Priority”. The issue is that once this switch occurs, it appears to
affect all other partitions as well. Currently, if any job is submitted to any
of the partitions, regardless of the resources available, it is queued by the
scheduler with the reason “Priority”. We had the scheduler initially
configured for backfill but have also tried switching to builtin, and it did
not seem to make a difference. I tried increasing the default_queue_depth to 10
and it didn’t seem to help. The scheduler log is also unhelpful, as it simply
lists the accounting-limited jobs and never mentions the “Priority” queued jobs:

sched: [2021-06-11T13:21:53.993] JobId=495780 delayed for accounting policy
sched: [2021-06-11T13:21:53.997] JobId=495781 delayed for accounting policy
sched: [2021-06-11T13:21:54.001] JobId=495782 delayed for accounting policy
sched: [2021-06-11T13:21:54.005] JobId=495783 delayed for accounting policy
sched: [2021-06-11T13:21:54.005] loop taking too long, breaking out

I’ve gone through all the documentation I’ve found on the scheduler and cannot 
seem to resolve this. I’m hoping I’m simply missing something.

Any help would be great. Thank you!

Jason



[slurm-users] Slurm Scheduler Help

2021-06-11 Thread Dana, Jason T.
Hello,

I currently manage a small cluster separated into 4 partitions. I am
experiencing unexpected behavior from the scheduler when the queue has been
flooded with a large number of jobs (around 60k) by a single user to a single
partition. We have each user bound to a global GrpTRES CPU limit. Once this
user reaches their CPU limit, the jobs are queued with reason
“AssocGroupCpuLimit”, but after a few hundred or so jobs the reason seems to
switch to “Priority”. The issue is that once this switch occurs, it appears to
affect all other partitions as well. Currently, if any job is submitted to any
of the partitions, regardless of the resources available, it is queued by the
scheduler with the reason “Priority”. We had the scheduler initially
configured for backfill but have also tried switching to builtin, and it did
not seem to make a difference. I tried increasing the default_queue_depth to 10
and it didn’t seem to help. The scheduler log is also unhelpful, as it simply
lists the accounting-limited jobs and never mentions the “Priority” queued jobs:

sched: [2021-06-11T13:21:53.993] JobId=495780 delayed for accounting policy
sched: [2021-06-11T13:21:53.997] JobId=495781 delayed for accounting policy
sched: [2021-06-11T13:21:54.001] JobId=495782 delayed for accounting policy
sched: [2021-06-11T13:21:54.005] JobId=495783 delayed for accounting policy
sched: [2021-06-11T13:21:54.005] loop taking too long, breaking out
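
(For completeness: the per-user cap mentioned above is applied through the
accounting associations, along the lines of the command below; the CPU value
is a placeholder rather than our actual limit.)

  sacctmgr modify user where name=<username> set GrpTRES=cpu=128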

I’ve gone through all the documentation I’ve found on the scheduler and cannot 
seem to resolve this. I’m hoping I’m simply missing something.

Any help would be great. Thank you!

Jason



Re: [slurm-users] [External] What is an easy way to prevent users from running programs on the master/login node.

2021-06-11 Thread Stefan Staeglich
Hi Prentice,

thanks for the hint. I'm evaluating this too.

It seems that Arbiter doesn't distinguish between RAM that's actually used and
RAM that's used only as cache. Or is my impression wrong?

Best,
Stefan

Am Dienstag, 27. April 2021, 17:35:35 CEST schrieb Prentice Bisbal:
> I think someone asked this same exact question a few weeks ago. The best
> solution I know of is to use Arbiter, which was created exactly for this
> situation. It uses cgroups to limit resource usage, but it adjusts those
> limits based on login node utilization and each user's behavior ("bad"
> users get their resources limited more severely when they do "bad" things).
> 
> I will be deploying it myself very soon.
> 
> https://dylngg.github.io/resources/arbiterTechPaper.pdf
> 
> 
> Prentice
> 
> On 4/23/21 10:37 PM, Cristóbal Navarro wrote:
> > Hi Community,
> > I have a set of users who are still not very familiar with Slurm, and
> > yesterday they bypassed srun/sbatch and ran their CPU program directly on
> > the head/login node, thinking it would still run on a compute node. I am
> > aware that I will need to teach them some basic usage, but in the
> > meantime, how have you solved this type of user-behavior problem? Is
> > there a preferred way to restrict the master/login node's resources, or
> > actions, to regular users?
> > 
> > many thanks in advance


-- 
Stefan Stäglich,  Universität Freiburg,  Institut für Informatik
Georges-Köhler-Allee,  Geb.52,   79110 Freiburg,Germany

E-Mail : staeg...@informatik.uni-freiburg.de
WWW: gki.informatik.uni-freiburg.de
Telefon: +49 761 203-8223
Fax: +49 761 203-8222






Re: [slurm-users] Job requesting two different GPUs on two

2021-06-11 Thread Gestió Servidors
Hi,

I have tried with

> 
> #!/bin/bash
> #
> #SBATCH --job-name=N2n4
> #SBATCH --partition=cuda.q
> #SBATCH --output=N2n4-CUDA.txt
> #SBATCH -N 1 # number of nodes with the first GPU
> #SBATCH -n 2 # number of cores
> #SBATCH --gres=gpu:GeForceRTX3080:1
> #SBATCH hetjob
> #SBATCH -N 1 # number of nodes with the second GPU
> #SBATCH -n 2 # number of cores
> #SBATCH --gres=gpu:GeForceRTX2070:1
> ...

but the job runs only on the first GPU node...
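
I wonder whether the batch script itself only executes on the first het
component, so that work intended for the second GPU needs its own srun.
Something like the following is what I would try next (just an untested
sketch on my side; the program names are placeholders):

  srun --het-group=0 ./app_on_rtx3080 &
  srun --het-group=1 ./app_on_rtx2070 &
  wait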

Help...

Thanks a lot!