On Thu, Sep 08, 2016 at 04:12:56PM +0100, 'Federico Morg Pareschi' via 
ganeti-devel wrote:
> This patch adds a starvation prevention mechanism to the Ganeti
> predictive job queue. It calculates the age of submitted jobs
> comparing the current time to the submission time and appropriately
> adjusts the job's lock weight accordingly to avoid starvation.

As you've said in person, this is actually pretty simple. Simple is good. :)

Mostly LGTM, some comments inline...

Cheers,
Brian.
> Signed-off-by: Federico Morg Pareschi <[email protected]>
> ---
>  src/Ganeti/Constants.hs        | 11 +++++++++++
>  src/Ganeti/JQScheduler.hs      | 21 +++++++++++++++------
>  src/Ganeti/JQueue/LockDecls.hs | 14 +++++++++++++-
>  3 files changed, 39 insertions(+), 7 deletions(-)
> 
> diff --git a/src/Ganeti/Constants.hs b/src/Ganeti/Constants.hs
> index 1e0b877..bee8af3 100644
> --- a/src/Ganeti/Constants.hs
> +++ b/src/Ganeti/Constants.hs
> @@ -5634,3 +5634,14 @@ staticLockMaybeBlockWeight = 1.5
>  -- | Weight assigned to two locks that will surely conflict.
>  staticLockSureBlockWeight :: Double
>  staticLockSureBlockWeight = 3
> +
> +-- | How many seconds make up a "tick" in the job queue starvation prevention
> +-- system.
> +jobQueueTickInSeconds :: Double
> +jobQueueTickInSeconds = 30
> +
> +-- | Job queue starvation prevention coefficient. This means how many "ticks"
> +-- of time need to pass before a job has 100% certainty to be in front of the
> +-- queue.
> +jobQueueStarvationCoeff :: Double
> +jobQueueStarvationCoeff = 30

I suspect we might want these two as cluster parameters like max_running_jobs.

At first I thought of exposing these as luxid command line options, but
then you'd have the problem of copying updates to these options in
/etc/defaults/ganeti to all nodes, so having them as cluster params is better.

[snip]
> +-- | Adjust any static weight to its appropriate anti-starvation value. It
> +-- takes the current time and the time since the job was queued as 
> parameters.
> +adjustedWeight :: Int -> JQ.Timestamp -> Double -> Double
> +adjustedWeight currTime (recvTime, _) weight =
> +  let timeVal = max 0 (1 - ticks / C.jobQueueStarvationCoeff)
> +      ticks = timeDiff / C.jobQueueTickInSeconds
> +      timeDiff = fromIntegral $ currTime - recvTime
> +  in weight * timeVal

The type signature of this is quite odd: two timestamps, one integer and the
other JQ.Timestamp. Could you make both be JQ.Timestamp (I know it's just a
sec/usec tuple), and use JQueue.currentTimestamp to get the current time.
JQ.Timestamp seems to be used pretty heavily in the rest of the scheduler code.

Reply via email to