On Thu, Sep 08, 2016 at 04:12:56PM +0100, 'Federico Morg Pareschi' via ganeti-devel wrote: > This patch adds a starvation prevention mechanism to the Ganeti > predictive job queue. It calculates the age of submitted jobs > comparing the current time to the submission time and appropriately > adjusts the job's lock weight accordingly to avoid starvation.
As you've said in person, this is actually pretty simple. Simple is good. :) Mostly LGTM, some comments inline... Cheers, Brian. > Signed-off-by: Federico Morg Pareschi <[email protected]> > --- > src/Ganeti/Constants.hs | 11 +++++++++++ > src/Ganeti/JQScheduler.hs | 21 +++++++++++++++------ > src/Ganeti/JQueue/LockDecls.hs | 14 +++++++++++++- > 3 files changed, 39 insertions(+), 7 deletions(-) > > diff --git a/src/Ganeti/Constants.hs b/src/Ganeti/Constants.hs > index 1e0b877..bee8af3 100644 > --- a/src/Ganeti/Constants.hs > +++ b/src/Ganeti/Constants.hs > @@ -5634,3 +5634,14 @@ staticLockMaybeBlockWeight = 1.5 > -- | Weight assigned to two locks that will surely conflict. > staticLockSureBlockWeight :: Double > staticLockSureBlockWeight = 3 > + > +-- | How many seconds make up a "tick" in the job queue starvation prevention > +-- system. > +jobQueueTickInSeconds :: Double > +jobQueueTickInSeconds = 30 > + > +-- | Job queue starvation prevention coefficient. This means how many "ticks" > +-- of time need to pass before a job has 100% certainty to be in front of the > +-- queue. > +jobQueueStarvationCoeff :: Double > +jobQueueStarvationCoeff = 30 I suspect we might want these two as cluster parameters like max_running_jobs. At first I thought of exposing these as luxid command line options, but then you'd have the problem of copying updates to these options in /etc/defaults/ganeti to all nodes, so having them as cluster params is better. [snip] > +-- | Adjust any static weight to its appropriate anti-starvation value. It > +-- takes the current time and the time since the job was queued as > parameters. > +adjustedWeight :: Int -> JQ.Timestamp -> Double -> Double > +adjustedWeight currTime (recvTime, _) weight = > + let timeVal = max 0 (1 - ticks / C.jobQueueStarvationCoeff) > + ticks = timeDiff / C.jobQueueTickInSeconds > + timeDiff = fromIntegral $ currTime - recvTime > + in weight * timeVal The type signature of this is quite odd: two timestamps, one integer and the other JQ.Timestamp. Could you make both be JQ.Timestamp (I know it's just a sec/usec tuple), and use JQueue.currentTimestamp to get the current time. JQ.Timestamp seems to be used pretty heavily in the rest of the scheduler code.
