Hi,
there is a discrepancy between the output of scontrol show job and sprio
(Slurm 14.11.4). I have two jobs: one was submitted before slurmctld was
restarted, the other after the restart.
sprio -l shows:
JOBID  USER  PRIORITY  AGE  FAIRSHARE  JOBSIZE  PARTITION  QOS   NICE
14115  XXX   10000     0    0          0        0          0     -10000
14204  XXX   1000      0    0          0        0          1000  0
while scontrol shows:
JobId=14115 JobName=XXX
UserId=XXX(XXX) GroupId=XXX(XXX)
Priority=1233 Nice=0 Account=XXX QOS=normal
[…]
JobId=14204 JobName=XXX
UserId=XXX(XXX) GroupId=XXX(XXX)
Priority=1000 Nice=0 Account=XXX QOS=normal
While the priority shown for the second job is the same in both programs, it
differs for the first (10000 vs. 1233). The priority for job 14115 was 1233
before the restart.
My understanding is that the actual priority should be saved and restored when
slurmctld is restarted. This is not the case. It seems that slurmctld forgets
the priority and assigns an arbitrary but higher number to jobs that were
already in the queue, and also lowers their nice value (here to -10000). It
also forgets the individual sub-priorities contributed by the different
factors of the multifactor priority plugin.
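
To make the sprio numbers easier to check, here is a small Python sketch. It
assumes that the multifactor plugin reports PRIORITY as the sum of the weighted
factor columns minus the NICE column; the helper names (parse_sprio,
recomputed_priority) are made up for illustration and are not part of Slurm.

```python
# Rows copied from the sprio -l output above (weighted factor values).
SPRIO_ROWS = """\
JOBID USER PRIORITY AGE FAIRSHARE JOBSIZE PARTITION QOS NICE
14115 XXX 10000 0 0 0 0 0 -10000
14204 XXX 1000 0 0 0 0 1000 0
"""

# Factor columns that are assumed to add up to the priority.
FACTORS = ("AGE", "FAIRSHARE", "JOBSIZE", "PARTITION", "QOS")


def parse_sprio(text):
    """Turn whitespace-separated sprio output into a list of dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def recomputed_priority(row):
    """Assumed formula: sum of the weighted factors minus the nice value."""
    return sum(int(row[name]) for name in FACTORS) - int(row["NICE"])


for row in parse_sprio(SPRIO_ROWS):
    # Print job id, priority reported by sprio, and the recomputed value.
    print(row["JOBID"], row["PRIORITY"], recomputed_priority(row))
```

Under that assumption both sprio rows are internally consistent: the 10000 for
job 14115 comes entirely from the NICE value of -10000, whereas scontrol
reports Nice=0 and Priority=1233 for the same job.
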
Regards,
Uwe