[slurm-dev] SLURM allows jobs to start even if they use more CPUs than requested

Rémi Piatek Tue, 23 Jun 2015 01:48:10 -0700


Hello,

I am getting started with SLURM and I am having a hard time understanding how it allocates CPUs to users depending on the resources they request. The problem I am facing can be summarized as follows. Consider a bash script test.sh that requests 8 CPUs but actually starts a job that uses 10 CPUs:


    #!/bin/sh
    #SBATCH --ntasks=8
    stress -c 10

On a server with 32 CPUs, if I submit this script 5 times with sbatch test.sh, 4 of the jobs start running right away and the last one stays pending, as shown by the squeue command:


    JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
        5      main  test.sh     jack PD       0:00      1 (Resources)
        1      main  test.sh     jack  R       0:08      1 server
        2      main  test.sh     jack  R       0:08      1 server
        3      main  test.sh     jack  R       0:05      1 server
        4      main  test.sh     jack  R       0:05      1 server

The problem is that these 4 jobs are actually using 40 CPUs and overload the server. I would instead expect SLURM either not to start jobs that actually use more resources than the user requested, or to put them on hold until there are enough resources to start them. How can I make sure that the users of my server do not start jobs that use too many CPUs?
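
To spell out the arithmetic behind these numbers, here is a small sketch. The variable names are only illustrative, and it assumes that SLURM counts one CPU per requested task and nothing else when deciding how many jobs fit on the node:

```shell
#!/bin/sh
# Numbers taken from the example above; variable names are illustrative only.
node_cpus=32        # CPUs=32 in slurm.conf
requested_cpus=8    # --ntasks=8 in test.sh (assumed to mean one CPU per task)
actual_workers=10   # stress -c 10 spawns 10 CPU-bound workers

# Jobs that fit at once if only the requested CPUs are counted.
concurrent_jobs=$((node_cpus / requested_cpus))

# CPU-bound workers actually running across those jobs.
busy_workers=$((concurrent_jobs * actual_workers))

echo "jobs running at once: $concurrent_jobs"
echo "CPU-bound workers:    $busy_workers on $node_cpus CPUs"
```

This matches what I observe: 4 jobs run, the fifth waits, and the 4 running jobs together keep 40 workers busy on 32 CPUs.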


Some useful details about my slurm.conf file:

    # SCHEDULING
    #DefMemPerCPU=0
    FastSchedule=1
    #MaxMemPerCPU=0
    SchedulerType=sched/backfill
    SchedulerPort=7321
    SelectType=select/cons_res
    SelectTypeParameters=CR_CPU
    # COMPUTE NODES
    NodeName=server CPUs=32 RealMemory=10000 State=UNKNOWN
    # PARTITIONS
    PartitionName=main Nodes=server Default=YES Shared=YES MaxTime=INFINITE State=UP

I am probably making a trivial mistake in the configuration file, or just misunderstanding a basic concept of SLURM. Any help or advice would be much appreciated.


Many thanks in advance!
