On 09/04/2012 04:22 PM, Miguel Méndez wrote:
> Hi Lennart,
>
> I have some questions for you so I can help you:
>
> Have you tried setting DebugFlags=Priority in slurm.conf to get more
> information about priorities in slurmctld.log?
>
> Are your priorities being recalculated every "PriorityCalcPeriod" (in
> slurm.conf as well, default is 5 min)? If not, do you have Accounting
> enabled?
Hi Miguel,
And thanks for trying to help me!
Yes, I have configured
PriorityCalcPeriod=5
in the slurm.conf file.
I do not understand your question about whether I have Accounting enabled.
I have no such configuration variable in my slurm.conf file. I run
accounting through slurmdbd (see AccountingStorageType in the slurm.conf
dump below).
I have now tried your suggestion to set DebugFlags=Priority,
so I can now restate my question with more detail.
In slurm.conf, I have configured
PriorityMaxAge=14-0
PriorityWeightAge=20160
The plan behind this configuration is to start with an age
value of zero and get approximately one priority point added
for each minute that the job has been waiting, up to a
maximum of 20160.
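The arithmetic behind that plan can be sketched in a few lines. This is only my reading of the multifactor plugin, not its actual source: I assume the age factor ramps linearly from 0 to 1 over PriorityMaxAge and is then capped.

```python
# Minimal model of the age component, assuming a linear ramp:
#   factor = min(queued_time / PriorityMaxAge, 1.0)
MAX_AGE_MINUTES = 14 * 24 * 60   # PriorityMaxAge=14-0, i.e. 14 days
WEIGHT_AGE = 20160               # PriorityWeightAge=20160

def weighted_age(queued_minutes):
    """Expected weighted age priority for a job queued this long."""
    factor = min(queued_minutes / MAX_AGE_MINUTES, 1.0)
    return factor * WEIGHT_AGE

print(MAX_AGE_MINUTES)        # 20160: there are exactly 20160 minutes
                              # in 14 days, hence one point per minute
print(weighted_age(0))        # a fresh job should start at 0.0
print(weighted_age(10 ** 6))  # and cap at 20160 once past PriorityMaxAge
```

Since 14 days is exactly 20160 minutes and the weight is also 20160, the weighted value should rise by one point per queued minute.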
This has worked for a long time and usually still does.
But sometimes it goes seriously wrong: a new job instead starts
at an age value of 20160.
This can be seen with the sprio command, and also in the log
with Priority debugging turned on:
[2012-09-05T10:43:37] Weighted Age priority is 1.000000 * 20160 = 20160.00
[2012-09-05T10:43:37] Weighted Fairshare priority is 10.000000 * 10000 = 100000.00
[2012-09-05T10:43:37] Weighted JobSize priority is 0.001616 * 104 = 0.17
[2012-09-05T10:43:37] Weighted Partition priority is 0.000000 * 0 = 0.00
[2012-09-05T10:43:37] Weighted QOS priority is 0.000000 * 400000 = 0.00
[2012-09-05T10:43:37] Job 2182878 priority: 20160.00 + 100000.00 + 0.17 + 0.00 + 0.00 - 0 = 120160.17
The job was submitted at 2012-09-05T10:42:22, only 75 seconds before
the log entry, so it should have a weighted age priority of about one
point at most, yet for some unknown reason it got the maximum value
instead.
Here is a job that behaves normally, as expected:
[2012-09-05T10:44:17] Weighted Age priority is 0.000000 * 20160 = 0.00
[2012-09-05T10:44:17] Weighted Fairshare priority is 6.000000 * 10000 = 60000.00
[2012-09-05T10:44:17] Weighted JobSize priority is 0.002874 * 104 = 0.30
[2012-09-05T10:44:17] Weighted Partition priority is 0.000000 * 0 = 0.00
[2012-09-05T10:44:17] Weighted QOS priority is 0.000000 * 400000 = 0.00
[2012-09-05T10:44:17] Job 2182879 priority: 0.00 + 60000.00 + 0.30 + 0.00 + 0.00 - 0 = 60000.30
This job was submitted at 2012-09-05T10:44:17, the same second as the
log entry, so the weighted age priority is zero, as expected.
Here is an example for a job that has waited for some time:
[2012-09-05T00:07:31] Weighted Age priority is 0.004721 * 20160 = 95.17
[2012-09-05T00:07:31] Weighted Fairshare priority is 10.000000 * 10000 = 100000.00
[2012-09-05T00:07:31] Weighted JobSize priority is 0.002874 * 104 = 0.30
[2012-09-05T00:07:31] Weighted Partition priority is 0.000000 * 0 = 0.00
[2012-09-05T00:07:31] Weighted QOS priority is 0.300000 * 400000 = 120000.00
[2012-09-05T00:07:31] Job 2178648 priority: 95.17 + 100000.00 + 0.30 + 0.00 + 120000.00 - 0 = 220095.47
The submit time was 2012-09-04T22:32:08, about an hour and a half
earlier, so the weighted age priority works as intended in this case.
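For what it is worth, that log line can be sanity-checked with a few lines of arithmetic (again assuming the age factor ramps linearly from 0 to 1 over PriorityMaxAge; the times are copied from the log excerpt). The small discrepancy is presumably because the factor was computed at an earlier calculation tick than the moment the message was logged.

```python
from datetime import datetime, timedelta

# Assumed linear ramp: factor = min(queued / PriorityMaxAge, 1.0)
max_age = timedelta(days=14)               # PriorityMaxAge=14-0
submit = datetime(2012, 9, 4, 22, 32, 8)   # submit time of job 2178648
logged = datetime(2012, 9, 5, 0, 7, 31)    # timestamp of the log line

factor = (logged - submit) / max_age       # timedelta / timedelta -> float
print(round(factor, 6))                    # ~0.004731 vs. the logged 0.004721
print(round(factor * 20160, 2))            # ~95.38 vs. the logged 95.17
```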
This is SLURM version 2.4.1. (If someone thinks the Fairshare
priorities look strange, do not worry: they are intended to be that
way, but that is another story.)
Full slurm.conf configuration is at the bottom of this e-mail,
with line numbers added.
Cheers,
-- Lennart Karlsson
UPPMAX, Uppsala University, Sweden
http://www.uppmax.uu.se
==============================================
1 ControlMachine=kalkyl2
2 AuthType=auth/munge
3 CacheGroups=0
4 CryptoType=crypto/munge
5 EnforcePartLimits=YES
6 Epilog=/etc/slurm/slurm.epilog
7 JobCredentialPrivateKey=/etc/slurm/slurm.key
8 JobCredentialPublicCertificate=/etc/slurm/slurm.cert
9 JobRequeue=0
10 MaxJobCount=1000000
11 MpiDefault=none
12 Proctracktype=proctrack/cgroup
13 Prolog=/etc/slurm/slurm.prolog
14 PropagateResourceLimits=RSS
15 ReturnToService=0
16 SallocDefaultCommand="/usr/bin/srun -n1 -N1 --pty --preserve-env --mpi=none -Q $SHELL"
17 SchedulerParameters=default_queue_depth=5000,bf_window=10080,max_job_bf=5000,bf_interval=120
18 SlurmctldPidFile=/var/run/slurmctld.pid
19 SlurmctldPort=6817
20 SlurmdPidFile=/var/run/slurmd.pid
21 SlurmdPort=6818
22 SlurmdSpoolDir=/var/spool/slurmd
23 SlurmUser=slurm
24 StateSaveLocation=/usr/local/slurm-state
25 SwitchType=switch/none
26 TaskPlugin=task/cgroup
27 TaskProlog=/etc/slurm/slurm.taskprolog
28 TopologyPlugin=topology/tree
29 TmpFs=/scratch
30 TrackWCKey=yes
31 TreeWidth=20
32 UsePAM=1
33 HealthCheckInterval=1800
34 HealthCheckProgram=/etc/slurm/slurm.healthcheck
35 InactiveLimit=0
36 KillWait=600
37 MessageTimeout=60
38 ResvOverRun=UNLIMITED
39 MinJobAge=43200
40 SlurmctldTimeout=300
41 SlurmdTimeout=1200
42 Waittime=0
43 FastSchedule=1
44 MaxMemPerCPU=3072
45 SchedulerType=sched/backfill
46 SchedulerPort=7321
47 SelectType=select/cons_res
48 SelectTypeParameters=CR_Core_Memory
49 PriorityType=priority/multifactor
50 PriorityDecayHalfLife=0
51 PriorityCalcPeriod=5
52 PriorityUsageResetPeriod=MONTHLY
53 PriorityFavorSmall=NO
54 PriorityMaxAge=14-0
55 PriorityWeightAge=20160
56 PriorityWeightFairshare=10000
57 PriorityWeightJobSize=104
58 PriorityWeightPartition=0
59 PriorityWeightQOS=400000
60 AccountingStorageEnforce=associations,limits,qos
61 AccountingStorageHost=kalkyl2
62 AccountingStoragePort=7031
63 AccountingStorageType=accounting_storage/slurmdbd
64 ClusterName=kalkyl
65 DebugFlags=NO_CONF_HASH,Priority
66 JobCompLoc=/etc/slurm/slurm_jobcomp_logger
67 JobCompType=jobcomp/script
68 JobAcctGatherFrequency=30
69 JobAcctGatherType=jobacct_gather/linux
70 SlurmctldDebug=3
71 SlurmctldLogFile=/var/log/slurm/slurmctld.log
72 SlurmdDebug=3
73 SlurmdLogFile=/var/log/slurm/slurmd.log
74 NodeName=DEFAULT Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN TmpDisk=100000
75
76 NodeName=q[1-16] RealMemory=72000 Feature=fat,mem72GB,ibsw1 Weight=3
77 NodeName=q[17-32] RealMemory=48000 Feature=fat,mem48GB,ibsw1 Weight=2
78 NodeName=q[33-64] RealMemory=24000 Feature=thin,mem24GB,ibsw2 Weight=1
79 NodeName=q[65-96] RealMemory=24000 Feature=thin,mem24GB,ibsw3 Weight=1
80 NodeName=q[97-108] RealMemory=24000 Feature=thin,mem24GB,ibsw4 Weight=1
81 NodeName=q[109-140] RealMemory=24000 Feature=thin,mem24GB,ibsw5 Weight=1
82 NodeName=q[141-172] RealMemory=24000 Feature=thin,mem24GB,ibsw6 Weight=1
83 NodeName=q[173-204] RealMemory=24000 Feature=thin,mem24GB,ibsw7 Weight=1
84 NodeName=q[205-216] RealMemory=24000 Feature=thin,mem24GB,ibsw8 Weight=1
85
86 NodeName=q[217-232] RealMemory=24000 Feature=thin,mem24GB,ibsw4 Weight=1
87
88 NodeName=q[233-252] RealMemory=24000 Feature=thin,mem24GB,ibsw8 Weight=1
89 NodeName=q[253-284] RealMemory=24000 Feature=thin,mem24GB,ibsw9 Weight=1
90 NodeName=q[285-316] RealMemory=24000 Feature=thin,mem24GB,ibsw10 Weight=1
91 NodeName=q[317-348] RealMemory=24000 Feature=thin,mem24GB,ibsw11 Weight=1
92
93 PartitionName=all Nodes=q[1-348] Shared=EXCLUSIVE DefaultTime=00:00:01 MaxTime=14400 State=DOWN
94 PartitionName=core Nodes=q[45-348] Default=YES Shared=NO MaxTime=14400 MaxNodes=1 State=UP
95 PartitionName=node Nodes=q[1-32,45-348] Shared=EXCLUSIVE DefaultTime=00:00:01 MaxTime=14400 State=UP
96 PartitionName=devel Nodes=q[33-44] Shared=EXCLUSIVE DefaultTime=00:00:01 MaxTime=60 MaxNodes=4 State=UP