Hi Bjørn-Helge,

On 6/23/22 09:18, Bjørn-Helge Mevik wrote:
<gerard....@cines.fr> writes:

TRESRaw cpu is lower than before as I'm alone on the system and no other job was 
submitted.
Any explanation for this?

I'd guess you have turned on FairShare priorities.  Unfortunately, in
Slurm the same internal variables are used for fairshare calculations as
for GrpTRESMins (and similar), so when fair share priorities are in use,
Slurm will reduce the accumulated GrpTRESMins usage over time.  This means that it
is impossible(*) to use GrpTRESMins limits and fairshare
priorities at the same time.

This is a surprising observation!  We use a 14-day half-life in slurm.conf:
PriorityDecayHalfLife=14-0

Since our longest-running jobs can run for only 7 days, maybe our limits never get reduced as you describe?

The slurm.conf man page says that PriorityDecayHalfLife affects hard time limits per association:

       PriorityDecayHalfLife
              This controls how long  prior  resource  use  is  considered  in
              determining how over- or under-serviced an association is (user,
              bank account and cluster)  in  determining  job  priority.   The
              record  of  usage  will  be  decayed over time, with half of the
              original value cleared at age PriorityDecayHalfLife.  If set  to
              0  no  decay  will  be  applied.  This is helpful if you want to
              enforce hard time limits per association.  If set to  0
              PriorityUsageResetPeriod  must  be  set  to some interval.  Applicable
              only if PriorityType=priority/multifactor.  The unit is  a  time
              string  (i.e.  min, hr:min:00, days-hr:min:00, or days-hr).  The
              default value is 7-0 (7 days).

Is this what explains your statement?
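
To make the half-life wording concrete, here is a rough sketch of how recorded usage would shrink if the decay is plain exponential, which is how I read the man page text. The function name and the 1000 CPU-minute figure are made up for illustration, and Slurm's internal bookkeeping may well differ in detail:

```python
# Illustrative sketch only: assumes simple exponential decay as the man page
# text suggests. decayed_usage() is a hypothetical helper, not a Slurm API.

def decayed_usage(usage: float, elapsed_days: float, half_life_days: float) -> float:
    """Return the usage that remains after elapsed_days of decay."""
    if half_life_days == 0:
        # PriorityDecayHalfLife=0 means that no decay is applied.
        return usage
    return usage * 0.5 ** (elapsed_days / half_life_days)


# Hypothetical example: 1000 recorded CPU-minutes with PriorityDecayHalfLife=14-0
for days in (0, 7, 14, 28):
    remaining = decayed_usage(1000.0, days, 14.0)
    print(f"after {days:2d} days: {remaining:7.1f} CPU-minutes")
```

If this reading is right, the decay is continuous rather than something that only kicks in after 14 days: about 71% of the usage would remain after 7 days and 50% after 14 days.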

BTW, I've written a handy script for displaying user limits in a readable format:
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/showuserlimits

/Ole
