When computing tiered allocation statistics, the normal step is to shrink the resource that shows the most errors. Some failure modes, however, are abstract: N+1 redundancy, for example, does not refer to any physical resource that could be shrunk. Nevertheless, there is an underlying physical resource that most likely causes this kind of failure. For N+1 redundancy, the missing resource is almost always memory, so shrink based on this assumption.

Signed-off-by: Klaus Aehlig <[email protected]>
---
 src/Ganeti/HTools/Cluster.hs | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/Ganeti/HTools/Cluster.hs b/src/Ganeti/HTools/Cluster.hs
index 891e3f9..a8313bf 100644
--- a/src/Ganeti/HTools/Cluster.hs
+++ b/src/Ganeti/HTools/Cluster.hs
@@ -861,6 +861,13 @@ sufficesShrinking allocFn inst fm =
   of x:_ -> Just . snd $ x
      _ -> Nothing
 
+-- | For a failure determine the underlying resource that most likely
+-- causes this kind of failure. In particular, N+1 violations are most
+-- likely caused by lack of memory.
+underlyingCause :: FailMode -> FailMode
+underlyingCause FailN1 = FailMem
+underlyingCause x = x
+
 -- | Tiered allocation method.
 --
 -- This places instances on the cluster, and decreases the spec until
@@ -877,7 +884,8 @@ tieredAlloc opts nl il limit newinst allocnodes ixes cstats =
                                Nothing -> (False, Nothing)
                                Just n -> (n <= ixes_cnt,
                                             Just (n - ixes_cnt))
-          sortedErrs = map fst $ sortBy (flip $ comparing snd) errs
+          sortedErrs = nub . map (underlyingCause . fst)
+                         $ sortBy (flip $ comparing snd) errs
           suffShrink = sufficesShrinking
                          (fromMaybe emptyAllocSolution
                           . flip (tryAlloc opts nl' il')
                                  allocnodes)
-- 
2.5.0.rc2.392.g76e840b
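
The effect of the new sortedErrs expression can be tried in isolation. The following is a minimal sketch and not part of the patch: the FailMode type is a cut-down stand-in for the one in Ganeti, and shrinkOrder is a hypothetical name for the expression that the patch binds to sortedErrs.

```haskell
import Data.List (nub, sortBy)
import Data.Ord (comparing)

-- Cut-down stand-in for Ganeti's FailMode (assumption: the real type
-- has more constructors than the four listed here).
data FailMode = FailMem | FailDisk | FailCPU | FailN1
  deriving (Eq, Show)

-- Same mapping as in the patch: an N+1 violation is attributed to
-- memory; every other failure mode is its own cause.
underlyingCause :: FailMode -> FailMode
underlyingCause FailN1 = FailMem
underlyingCause x = x

-- Hypothetical helper mirroring the new sortedErrs binding: order the
-- failure modes by descending error count, map each to its underlying
-- cause, and drop repeats while keeping the first occurrence.
shrinkOrder :: [(FailMode, Int)] -> [FailMode]
shrinkOrder = nub . map (underlyingCause . fst)
                . sortBy (flip $ comparing snd)
```

With the counts [(FailMem, 3), (FailN1, 10), (FailDisk, 5)], shrinkOrder yields [FailMem, FailDisk]: FailN1 ranks first and is mapped to FailMem, and nub drops the later FailMem entry, which would otherwise appear a second time in the list.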
