Hi Benjamin,
We have the following set in slurm.conf as you have suggested:
AccountingStorageEnforce=limits,qos
PriorityWeightQOS=1000
And we ran:
sacctmgr modify qos normal set Grpcpus=300
sacctmgr show qos format=GrpTRES
GrpTRES
-------------
cpu=200
I see that when I submit a job requesting over 200 CPUs, the job gets
blocked, which is good.
However, when I submit a job requesting just a few CPUs, the job gets
blocked as well.
[slurm-testing]$ squeue
JOBID PARTITION NAME     USER     ST TIME NODES NODELIST(REASON)
2064  debug     hello_pa slo      PD 0:00 10    (QOSGrpCpuLimit)
2065  smallmem  c_gth-dz jmcclain PD 0:00 6     (QOSGrpCpuLimit)
Do you know why it thinks the job is over the 200-CPU limit? Is there
another setting we need?
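One thing that might be worth checking, as a rough sketch rather than a diagnosis: as far as I understand, a QOS GrpTRES limit is counted in aggregate over all running jobs in that QOS, so a small job can pend with QOSGrpCpuLimit if the jobs already running sit close to the limit. Also, the NODES column above shows node counts, not CPU counts, so a 10-node job may ask for many more CPUs than it appears to. The helper names sum_cpus and fits_under_limit below are made up for illustration, and the QOS name "normal" and limit of 200 are taken from this thread.

```shell
#!/bin/sh
# Sum one CPU count per line from stdin. Intended input (not run here):
#   squeue -h -t RUNNING -q normal -o "%C"
# where %C prints the CPU count of each job.
sum_cpus() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Hypothetical helper: exit 0 if a new request still fits under the limit.
# Arguments: $1 = group CPU limit, $2 = CPUs in use, $3 = CPUs requested.
fits_under_limit() {
    [ $(( $2 + $3 )) -le "$1" ]
}

# Example usage on a live cluster (commented out so the sketch is inert):
#   used=$(squeue -h -t RUNNING -q normal -o "%C" | sum_cpus)
#   fits_under_limit 200 "$used" 8 || echo "request would exceed the group limit"
```

If the running total plus the new request exceeds 200, the pending reason would be expected; if it does not, the cause is likely elsewhere.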
Thanks
Steven.
On 10/20/16 2:13 AM, Benjamin Redling wrote:
Hi Steven,
On 10/20/2016 00:22, Steven Lo wrote:
We have the attribute commented out:
#AccountingStorageEnforce=0
I think the best is to (re)visit "Accounting and Resource Limits":
http://slurm.schedmd.com/accounting.html
Right now I have no setup that needs accounting, but as far as I
currently understand you'll need AccountingStorageEnforce=limits,qos to
get your examples to work.
And just in case you haven't set it already:
for QOS (http://slurm.schedmd.com/qos.html)
<quote>
"PriorityWeightQOS" configuration parameter must be defined in the
slurm.conf file and assigned an integer value greater than zero.
</quote>
What I am unsure about -- esp. not knowing your config -- is whether
there are any other unmet dependencies.
It would be nice if somebody with real experience with accounting could
confirm or give a pointer.
Regards,
Benjamin