[ https://issues.apache.org/jira/browse/YUNIKORN-21?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092910#comment-17092910 ]
Tao Yang commented on YUNIKORN-21:
----------------------------------

Thanks [~wangda] for the feedback.
bq. if you can have a PoC patch, I can help with review and give more detailed suggestions.
I would like to create a PoC PR later and will let you know then. Thanks very much.
bq. After another thought, I think the weighted sort policy may make sense if the number of scorers involved is small (no more than 3), I want to avoid 20 scorers involved and give a weighted score which we cannot explain at all
Makes sense to me. I think there may be various kinds of scorers to choose from, but 3 scorers are good enough for most scenarios.

> Revisit node sorting algorithm for fairness
> -------------------------------------------
>
>                 Key: YUNIKORN-21
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-21
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: core - scheduler
>            Reporter: Wangda Tan
>            Priority: Major
>         Attachments: Improve node sorting algorithm v1.pdf, Improve node sorting algorithm v2.pdf
>
>
> Currently, we're using DominantRatio for the node sorting algorithm
> {code:java}
> func CompUsageShares(left, right *Resource) int {
>     lshares := getShares(left, nil)
>     rshares := getShares(right, nil)
>     return compareShares(lshares, rshares)
> }{code}
> Which is not good, two reasons:
> # Dominant resource compare is about 8X more expensive than single float compares for two resource types.
> # Dominant resource is not stable when we have scarce resource types like GPU.
> A node with 192GB mem, 32 vcores, and 1 GPU available, compared to 168GB mem, 64 vcores and 8 GPUs available; the former can go first because of the following logic:
> {code:java}
> if total == nil || total.Resources[k] == 0 {
>     // negative share is logged
>     if v < 0 {
>         log.Logger().Debug("usage is negative no total, share is also negative",
>             zap.Int64("resource quantity", int64(v)))
>     }
>     shares[idx] = float64(v)
>     idx++
>     continue
> }{code}
> I think we should discard dominant resource compare for node resource. Instead, we just use one resource type (like vcores) to compare available resource.
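
To make the two ideas in this thread concrete (comparing nodes on one resource type, and a weighted sort policy capped at 3 scorers), here is a minimal sketch. It is not YuniKorn code: Node, Scorer, WeightedScorer, AvailableShare, NewPolicy, WeightedScore and SortNodes are all hypothetical names, and the capacity passed to each scorer is an assumed normalisation constant rather than anything from the issue.

```go
package main

import (
	"errors"
	"sort"
)

// MaxScorers caps the policy size, following the "no more than 3"
// suggestion in the thread. The constant itself is an assumption.
const MaxScorers = 3

// Node is a hypothetical stand-in for a scheduler node.
type Node struct {
	Name      string
	Available map[string]float64 // available quantity per resource type
}

// Scorer maps a node to a score in [0, 1]; a higher score sorts first.
type Scorer func(n Node) float64

// WeightedScorer pairs a scorer with its weight in the final score.
type WeightedScorer struct {
	Weight float64
	Score  Scorer
}

// AvailableShare scores a node on a single resource type as
// available / capacity, clamped to [0, 1]. One float per node is
// compared, instead of a share per resource type.
func AvailableShare(resource string, capacity float64) Scorer {
	return func(n Node) float64 {
		if capacity <= 0 {
			return 0
		}
		share := n.Available[resource] / capacity
		if share < 0 {
			return 0
		}
		if share > 1 {
			return 1
		}
		return share
	}
}

// NewPolicy rejects empty policies and policies with too many scorers,
// so the weighted score stays explainable.
func NewPolicy(scorers ...WeightedScorer) ([]WeightedScorer, error) {
	if len(scorers) == 0 || len(scorers) > MaxScorers {
		return nil, errors.New("policy needs between 1 and 3 scorers")
	}
	return scorers, nil
}

// WeightedScore is the weighted sum of all scorer results for one node.
func WeightedScore(n Node, policy []WeightedScorer) float64 {
	sum := 0.0
	for _, s := range policy {
		sum += s.Weight * s.Score(n)
	}
	return sum
}

// SortNodes orders nodes by descending weighted score. The sort is
// stable, so nodes with equal scores keep their relative order.
func SortNodes(nodes []Node, policy []WeightedScorer) {
	sort.SliceStable(nodes, func(i, j int) bool {
		return WeightedScore(nodes[i], policy) > WeightedScore(nodes[j], policy)
	})
}
```

With the two nodes from the description (192GB mem, 32 vcores, 1 GPU versus 168GB mem, 64 vcores, 8 GPUs), a policy built only from AvailableShare("vcore", 64) puts the second node first, and the scarce GPU count never enters the comparison. Adding a memory scorer with a smaller weight keeps that order while still letting memory break near-ties.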