Hi Benjamin,

We have the following set in slurm.conf as you have suggested:

AccountingStorageEnforce=limits,qos
PriorityWeightQOS=1000

And we ran:

sacctmgr modify qos normal set Grpcpus=300

sacctmgr show qos format=GrpTRES
      GrpTRES
-------------
      cpu=200


I see that when I submit a job requesting over 200 CPUs, the job gets blocked, which is good. However, when I submit a job requesting just a few CPUs, the job gets blocked as well.

[slurm-testing]$ squeue
JOBID PARTITION     NAME     USER ST TIME NODES NODELIST(REASON)
 2064     debug hello_pa      slo PD 0:00    10 (QOSGrpCpuLimit)
 2065  smallmem c_gth-dz jmcclain PD 0:00     6 (QOSGrpCpuLimit)


Do you know why it thinks the job is over the 200-CPU limit? Is there another setting we need?


Thanks

Steven.


On 10/20/16 2:13 AM, Benjamin Redling wrote:
Hi Steven,

On 10/20/2016 00:22, Steven Lo wrote:
> We have the attribute commented out:
> #AccountingStorageEnforce=0

I think it is best to (re)visit "Accounting and Resource Limits":
http://slurm.schedmd.com/accounting.html

Right now I have no setup that needs accounting, but as far as I
currently understand you'll need AccountingStorageEnforce=limits,qos to
get your examples to work.
And just in case you haven't set it already:
for QOS (http://slurm.schedmd.com/qos.html)
<quote>
"PriorityWeightQOS" configuration parameter must be defined in the
slurm.conf file and assigned an integer value greater than zero.
</quote>
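
A minimal sketch of how those two settings might look in slurm.conf. The weight of 1000 is only an illustrative value, and the PriorityType line is an assumption about the site's setup (the QOS page describes PriorityWeightQOS in the context of the multifactor priority plugin):

```
# slurm.conf (fragment, illustrative values only)

# Enforce association limits and QOS limits
AccountingStorageEnforce=limits,qos

# Assumption: the multifactor priority plugin is in use at this site
PriorityType=priority/multifactor

# Any integer greater than zero; 1000 is just an example
PriorityWeightQOS=1000
```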

What I am unsure about -- esp. not knowing your config -- is whether
there are any other unmet dependencies.
It would be nice if somebody with real experience with accounting could
confirm this or give a pointer.

Regards,
Benjamin
