Re: [slurm-users] Exposing only requested CPUs to a job on a given node.

2021-05-14 Thread Ryan Cox
You can check with something like this inside a job:
cat /sys/fs/cgroup/cpuset/slurm/uid_$UID/job_$SLURM_JOB_ID/cpuset.cpus
That lists which CPUs you have access to.
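
For a quick check from inside Python itself, here is a minimal sketch 
(assuming Linux and Python 3.3 or newer, where os.sched_getaffinity is 
available) that reports the CPUs the process is actually allowed to run on:

    #!/usr/bin/env python3
    # Report the CPUs this process may run on (reflects cgroup/affinity limits).
    import os

    allowed = os.sched_getaffinity(0)  # set of CPU ids usable by this process
    print("Allowed CPUs: %s (count: %d)" % (sorted(allowed), len(allowed)))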


On 5/14/21 4:40 PM, Renfro, Michael wrote:


Untested, but prior experience with cgroups indicates that if things 
are working correctly, even if your code tries to run as many 
processes as you have cores, those processes will be confined to the 
cores you reserve.


Try a more compute-intensive worker function that will take some 
seconds or minutes to complete, and watch the reserved node with 'top' 
or a similar program. If, for example, the job reserved only 1 core and 
tried to run 20 processes, you'd see 20 processes in 'top', each at 
about 5% CPU.


To make the code a bit more polite, you can import the os module and 
create a new variable from the SLURM_CPUS_ON_NODE environment variable 
to guide Python into starting the correct number of processes:


    cpus_reserved = int(os.environ['SLURM_CPUS_ON_NODE'])

*From: *slurm-users  on behalf 
of Rodrigo Santibáñez 

*Date: *Friday, May 14, 2021 at 5:17 PM
*To: *Slurm User Community List 
*Subject: *Re: [slurm-users] Exposing only requested CPUs to a job on 
a given node.






Hi all,

I'm replying so I get notifications when this question is answered. I 
have a user whose Python script used almost all CPUs, even though the 
job was configured to use only 6 CPUs per task. I reviewed the code, and 
it doesn't have an explicit call to multiprocessing or anything similar, 
so the user is unaware of this behavior (and so am I).


Running slurm 20.02.6

Best!

On Fri, May 14, 2021 at 1:37 PM Luis R. Torres wrote:


Hi Folks,

We are currently running on SLURM 20.11.6 with cgroup constraints
for memory and CPU/Core.  Can the scheduler expose only the
requested number of CPU/Core resources to a job?  We have some
users that employ Python scripts with the multiprocessing
module, and the scripts apparently use all of the CPU/Cores in a
node, despite using options to constrain a task to just a given
number of CPUs.  We would like several multiprocessing jobs to
run simultaneously on the nodes, but not step on each other.

The sample script I use for testing is below; I'm looking for
something similar to what can be done with the GPU GRES
configuration, where only the number of GPUs requested is exposed
to the job requesting them.

#!/usr/bin/env python3

import multiprocessing


def worker():
    print("Worker on CPU #%s" % multiprocessing.current_process().name)
    result = 0
    for j in range(20):
        result += j**2
    print("Result on CPU {} is {}".format(multiprocessing.current_process().name, result))
    return


if __name__ == '__main__':
    pool = multiprocessing.Pool()
    jobs = []
    print("This host exposed {} CPUs".format(multiprocessing.cpu_count()))
    for i in range(multiprocessing.cpu_count()):
        p = multiprocessing.Process(target=worker, name=i).start()

Thanks,

-- 



Luis R. Torres



--
Ryan Cox
Director
Office of Research Computing
Brigham Young University



Re: [slurm-users] Determining Cluster Usage Rate

2021-05-14 Thread Christopher Samuel

On 5/14/21 1:45 am, Diego Zuccato wrote:


Usage reported in Percentage of Total
 

  Cluster  TRES Name  Allocated     Down  PLND Dow     Idle  Reserved  Reported
--------- ---------- ---------- -------- --------- -------- --------- ---------
      oph        cpu     81.93%    0.00%     0.00%   15.85%     2.22%   100.00%
      oph        mem     80.60%    0.00%     0.00%   19.40%     0.00%   100.00%


The "Reserved" column is the one you're interested in, it's indicating 
that for the 13th some jobs were waiting for CPUs, not memory.


You can look at a longer reporting period by specifying a start date,
something like:

sreport -t percent -T cpu,mem cluster utilization start=2021-01-01

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] Determining Cluster Usage Rate

2021-05-14 Thread Christopher Samuel

On 5/14/21 1:45 am, Diego Zuccato wrote:


It just doesn't recognize 'ALL'. It works if I specify the resources.


That's odd, what does this say?

sreport --version

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] Exposing only requested CPUs to a job on a given node.

2021-05-14 Thread Renfro, Michael
Untested, but prior experience with cgroups indicates that if things are 
working correctly, even if your code tries to run as many processes as you have 
cores, those processes will be confined to the cores you reserve.

Try a more compute-intensive worker function that will take some seconds or 
minutes to complete, and watch the reserved node with 'top' or a similar 
program. If, for example, the job reserved only 1 core and tried to run 20 
processes, you'd see 20 processes in 'top', each at about 5% CPU.

To make the code a bit more polite, you can import the os module and create a 
new variable from the SLURM_CPUS_ON_NODE environment variable to guide Python 
into starting the correct number of processes:

cpus_reserved = int(os.environ['SLURM_CPUS_ON_NODE'])
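
As a sketch of how that variable might be wired into the test script 
(assuming SLURM_CPUS_ON_NODE is set, as it is inside Slurm batch jobs, and 
falling back to 1 otherwise):

#!/usr/bin/env python3
# Start only as many workers as the job actually reserved.
import multiprocessing
import os

def worker(n):
    # A somewhat compute-heavy loop so the usage is visible in 'top'.
    return sum(j * j for j in range(10**7 + n))

if __name__ == '__main__':
    cpus_reserved = int(os.environ.get('SLURM_CPUS_ON_NODE', '1'))
    with multiprocessing.Pool(processes=cpus_reserved) as pool:
        results = pool.map(worker, range(cpus_reserved))
    print("Ran {} workers on {} reserved CPUs".format(len(results), cpus_reserved))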

From: slurm-users  on behalf of Rodrigo 
Santibáñez 
Date: Friday, May 14, 2021 at 5:17 PM
To: Slurm User Community List 
Subject: Re: [slurm-users] Exposing only requested CPUs to a job on a given 
node.



Hi all,

I'm replying so I get notifications when this question is answered. I have a 
user whose Python script used almost all CPUs, even though the job was 
configured to use only 6 CPUs per task. I reviewed the code, and it doesn't 
have an explicit call to multiprocessing or anything similar, so the user is 
unaware of this behavior (and so am I).

Running slurm 20.02.6

Best!

On Fri, May 14, 2021 at 1:37 PM Luis R. Torres <lrtor...@gmail.com> wrote:
Hi Folks,

We are currently running on SLURM 20.11.6 with cgroup constraints for memory 
and CPU/Core.  Can the scheduler expose only the requested number of CPU/Core 
resources to a job?  We have some users that employ Python scripts with the 
multiprocessing module, and the scripts apparently use all of the CPU/Cores 
in a node, despite using options to constrain a task to just a given number 
of CPUs.  We would like several multiprocessing jobs to run simultaneously on 
the nodes, but not step on each other.

The sample script I use for testing is below; I'm looking for something similar 
to what can be done with the GPU GRES configuration, where only the number of 
GPUs requested is exposed to the job requesting them.




#!/usr/bin/env python3

import multiprocessing


def worker():
    print("Worker on CPU #%s" % multiprocessing.current_process().name)
    result = 0
    for j in range(20):
        result += j**2
    print("Result on CPU {} is {}".format(multiprocessing.current_process().name, result))
    return


if __name__ == '__main__':
    pool = multiprocessing.Pool()
    jobs = []
    print("This host exposed {} CPUs".format(multiprocessing.cpu_count()))
    for i in range(multiprocessing.cpu_count()):
        p = multiprocessing.Process(target=worker, name=i).start()

Thanks,
--

Luis R. Torres


Re: [slurm-users] Exposing only requested CPUs to a job on a given node.

2021-05-14 Thread Rodrigo Santibáñez
Hi all,

I'm replying so I get notifications when this question is answered. I have a
user whose Python script used almost all CPUs, even though the job was
configured to use only 6 CPUs per task. I reviewed the code, and it doesn't
have an explicit call to multiprocessing or anything similar, so the user is
unaware of this behavior (and so am I).

Running slurm 20.02.6

Best!

On Fri, May 14, 2021 at 1:37 PM Luis R. Torres  wrote:

> Hi Folks,
>
> We are currently running on SLURM 20.11.6 with cgroup constraints for
> memory and CPU/Core.  Can the scheduler expose only the requested number of
> CPU/Core resources to a job?  We have some users that employ Python scripts
> with the multiprocessing module, and the scripts apparently use all of
> the CPU/Cores in a node, despite using options to constrain a task to just
> a given number of CPUs.  We would like several multiprocessing jobs to
> run simultaneously on the nodes, but not step on each other.
>
> The sample script I use for testing is below; I'm looking for something
> similar to what can be done with the GPU GRES configuration, where only the
> number of GPUs requested is exposed to the job requesting them.
>
>
> #!/usr/bin/env python3
>
> import multiprocessing
>
>
> def worker():
>     print("Worker on CPU #%s" % multiprocessing.current_process().name)
>     result = 0
>     for j in range(20):
>         result += j**2
>     print("Result on CPU {} is {}".format(multiprocessing.current_process().name, result))
>     return
>
>
> if __name__ == '__main__':
>     pool = multiprocessing.Pool()
>     jobs = []
>     print("This host exposed {} CPUs".format(multiprocessing.cpu_count()))
>     for i in range(multiprocessing.cpu_count()):
>         p = multiprocessing.Process(target=worker, name=i).start()
>
> Thanks,
> --
> 
> Luis R. Torres
>


[slurm-users] schedule mixed nodes first

2021-05-14 Thread Durai Arasan
Hi,

Frequently all of our GPU nodes (8x GPU each) are in MIXED state and there
is no IDLE node. Some jobs require a complete node (all 8 GPUs), and such
jobs therefore have to wait a very long time before they can run.

Is there a way of improving this situation? E.g. by not blocking IDLE nodes
with jobs that only use a fraction of the 8 GPUs? Why are single GPU jobs
not scheduled to fill already MIXED nodes before using IDLE ones?

What parameters/configuration need to be adjusted for this to be enforced?

Our current scheduling configuration:

slurm.conf:
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory

gres.conf (one node example):
NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[0-3] COREs=0-17,36-53
NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[4-7] COREs=18-35,54-71


Thank you,
Durai
Competence center for Machine Learning Tübingen


[slurm-users] Exposing only requested CPUs to a job on a given node.

2021-05-14 Thread Luis R. Torres
Hi Folks,

We are currently running on SLURM 20.11.6 with cgroup constraints for
memory and CPU/Core.  Can the scheduler expose only the requested number of
CPU/Core resources to a job?  We have some users that employ Python scripts
with the multiprocessing module, and the scripts apparently use all of
the CPU/Cores in a node, despite using options to constrain a task to just
a given number of CPUs.  We would like several multiprocessing jobs to
run simultaneously on the nodes, but not step on each other.

The sample script I use for testing is below; I'm looking for something
similar to what can be done with the GPU GRES configuration, where only the
number of GPUs requested is exposed to the job requesting them.


#!/usr/bin/env python3

import multiprocessing


def worker():
    print("Worker on CPU #%s" % multiprocessing.current_process().name)
    result = 0
    for j in range(20):
        result += j**2
    print("Result on CPU {} is {}".format(multiprocessing.current_process().name, result))
    return


if __name__ == '__main__':
    pool = multiprocessing.Pool()
    jobs = []
    print("This host exposed {} CPUs".format(multiprocessing.cpu_count()))
    for i in range(multiprocessing.cpu_count()):
        p = multiprocessing.Process(target=worker, name=i).start()

Thanks,
-- 

Luis R. Torres


Re: [slurm-users] Determining Cluster Usage Rate

2021-05-14 Thread Paul Edmon
XDMoD can give these sorts of stats.  I also have some Diamond 
collectors we use in concert with Grafana to pull data and plot it, 
which is useful for seeing large-scale usage trends:


https://github.com/fasrc/slurm-diamond-collector

-Paul Edmon-

On 5/13/2021 6:08 PM, Sid Young wrote:


Hi All,

Is there a way to define an effective "usage rate" of an HPC cluster 
using the data captured in the Slurm database?


Primarily I want to see whether it can help in presenting a case to the 
business for buying more hardware for the HPC :)


Sid Young


Re: [slurm-users] Different GPU types on the same server

2021-05-14 Thread David Gauchard

Hello, FWIW we did this with gres.conf and slurm.conf:

in node's /etc/slurm/gres.conf:

AutoDetect=off
Name=gpu Type=quadro_k620 File=/dev/nvidia0 CPUs=0-0
Name=gpu Type=nvs_510 File=/dev/nvidia1 CPUs=1-1
Name=gpu Type=nvs_510 File=/dev/nvidia2 CPUs=2-2

in server's slurm.conf:

NodeName=gputesthost CPUs=4 RealMemory=3500 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 Gres=gpu:quadro_k620:1,gpu:nvs_510:2


When submitting jobs, the GPU type is selected by the user with a request 
of this kind:
sbatch --gres=gpu:quadro_k620 ...
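
Inside such a job, one quick sanity check is to look at CUDA_VISIBLE_DEVICES, 
which Slurm's GPU GRES support normally exports for the job (a minimal 
sketch, assuming that variable is indeed set on your setup):

#!/usr/bin/env python3
# Show which GPU device(s) Slurm exposed to this job.
import os

print("CUDA_VISIBLE_DEVICES =", os.environ.get('CUDA_VISIBLE_DEVICES', '(not set)'))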

This is on a test host with old cards.
I don't know whether the `CPUs` ranges on each line should partition the 
host or whether all lines should instead use CPUs=0-3.


david

On 5/14/21 12:28 PM, Emyr James wrote:

Dear all,


We currently have a single GPU-capable server with 10x RTX2080Ti cards in 
it. One of our research groups wants to replace one of these cards with 
an RTX3090, but only if we can give them a higher priority on that 
particular card.


Is it possible to set up a queue that only includes a specific subset of 
the cards on a server? Is there any way to use GRES etc. to achieve this?



Many thanks,


Emyr James
Head of IT
CRG - Centre for Genomic Regulation
C/ Dr. Aiguader, 88
Edif. PRBB
08003 Barcelona, Spain
Phone Ext: #1098





[slurm-users] Different GPU types on the same server

2021-05-14 Thread Emyr James
Dear all,


We currently have a single GPU-capable server with 10x RTX2080Ti cards in it. 
One of our research groups wants to replace one of these cards with an RTX3090, 
but only if we can give them a higher priority on that particular card.

Is it possible to set up a queue that only includes a specific subset of the 
cards on a server? Is there any way to use GRES etc. to achieve this?


Many thanks,


Emyr James
Head of IT
CRG - Centre for Genomic Regulation
C/ Dr. Aiguader, 88
Edif. PRBB
08003 Barcelona, Spain
Phone Ext: #1098



Re: [slurm-users] Determining Cluster Usage Rate

2021-05-14 Thread Diego Zuccato

On 14/05/21 10:24, Ole Holm Nielsen wrote:

Referring to https://slurm.schedmd.com/tres.html, which TRES are defined 
on your cluster?

It just doesn't recognize 'ALL'. It works if I specify the resources.

root@str957-cluster:/var/log# sacctmgr show tres
      Type            Name     ID
---------- --------------- ------
       cpu                      1
       mem                      2
    energy                      3
      node                      4
   billing                      5
        fs            disk      6
      vmem                      7
     pages                      8
root@str957-cluster:/var/log# sreport -t percent -T ALL cluster utilization
sreport: fatal: No valid TRES given
root@str957-cluster:/var/log# sreport -t percent -T cpu,mem cluster utilization


Cluster Utilization 2021-05-13T00:00:00 - 2021-05-13T23:59:59
Usage reported in Percentage of Total

  Cluster  TRES Name  Allocated     Down  PLND Dow     Idle  Reserved  Reported
--------- ---------- ---------- -------- --------- -------- --------- ---------
      oph        cpu     81.93%    0.00%     0.00%   15.85%     2.22%   100.00%
      oph        mem     80.60%    0.00%     0.00%   19.40%     0.00%   100.00%


BYtE,
 Diego

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786



Re: [slurm-users] Determining Cluster Usage Rate

2021-05-14 Thread Ole Holm Nielsen

On 14-05-2021 08:52, Diego Zuccato wrote:

On 14/05/2021 08:19, Christopher Samuel wrote:


sreport -t percent -T ALL cluster utilization

"sreport: fatal: No valid TRES given" :(


This works correctly on our cluster:

$  sreport -t percent -T ALL cluster utilization

Cluster Utilization 2021-05-13T00:00:00 - 2021-05-13T23:59:59
Usage reported in Percentage of Total

  Cluster  TRES Name  Allocated     Down  PLND Dow     Idle  Reserved  Reported
--------- ---------- ---------- -------- --------- -------- --------- ---------
 niflheim        cpu     98.22%    0.11%     0.00%    0.00%     1.67%   100.00%
 niflheim        mem     86.52%    0.10%     0.00%   13.38%     0.00%   100.00%
 niflheim     energy      0.00%    0.00%     0.00%    0.00%     0.00%     0.00%
 niflheim    billing     92.70%    0.04%     0.00%    7.26%     0.00%   100.00%
 niflheim    fs/disk      0.00%    0.00%     0.00%    0.00%     0.00%     0.00%
 niflheim       vmem      0.00%    0.00%     0.00%    0.00%     0.00%     0.00%
 niflheim      pages      0.00%    0.00%     0.00%    0.00%     0.00%     0.00%



Referring to https://slurm.schedmd.com/tres.html, which TRES are defined 
on your cluster?


$ sacctmgr show tres

I get this output:

      Type            Name     ID
---------- --------------- ------
       cpu                      1
       mem                      2
    energy                      3
      node                      4
   billing                      5
        fs            disk      6
      vmem                      7
     pages                      8

/Ole