[ 
https://issues.apache.org/jira/browse/YUNIKORN-21?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092325#comment-17092325
 ] 

Wangda Tan commented on YUNIKORN-21:
------------------------------------

Thanks [~Tao Yang], 

I took a quick look at the doc, and I'm not sure I fully understand the 
following details: 

1) What is the difference between the node-sorting policy and the evaluator? If 
the node order is based solely on the evaluator, can we use a single component 
to replace the two? (Or we could expose only the sorter interface and make the 
evaluator an implementation detail.) 

2) The scope of the incremental sorting algorithm is not very clear to me. Are 
we going to maintain a sorted list for every request? That might be too much 
on a per-request basis, since we could have many different requests.

3) I'm not quite sure about the "Weight" concept. Are we going to support 
multiple node scorers like the K8s default scheduler? I personally don't prefer 
that approach, since the behavior behind a weighted result is not easy to 
explain.

4) How fast can we do the re-sorting? Since the node list keeps changing and 
node status also changes fast, are we going to keep an always up-to-date 
sorting result, or will we accept some latency? (If we need pre-sorted node 
lists on a per-request basis, there would be too many sorted node lists to 
maintain.)
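
To illustrate the concern in 3), here is a minimal sketch. It is not taken from the design doc: the Node type, the two per-resource scores, and the weights are all hypothetical. It only shows that the same two nodes swap places when the weights change, which is what makes a weighted ordering hard to explain after the fact.

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a hypothetical, simplified view of a node: free capacity per
// resource expressed as a fraction in [0, 1]. Not a YuniKorn type.
type Node struct {
	Name    string
	FreeCPU float64
	FreeMem float64
}

// weightedScore combines two per-resource scores using caller-supplied
// weights (assumed here; the design doc may define weights differently).
func weightedScore(n Node, wCPU, wMem float64) float64 {
	return wCPU*n.FreeCPU + wMem*n.FreeMem
}

// rankByWeights returns node names ordered by descending weighted score.
// The input slice is copied so the caller's order is left untouched.
func rankByWeights(nodes []Node, wCPU, wMem float64) []string {
	sorted := append([]Node(nil), nodes...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return weightedScore(sorted[i], wCPU, wMem) > weightedScore(sorted[j], wCPU, wMem)
	})
	names := make([]string, len(sorted))
	for i, n := range sorted {
		names[i] = n.Name
	}
	return names
}

func main() {
	nodes := []Node{
		{Name: "node-a", FreeCPU: 0.9, FreeMem: 0.2},
		{Name: "node-b", FreeCPU: 0.4, FreeMem: 0.8},
	}
	// Same nodes, different weights, opposite order.
	fmt.Println(rankByWeights(nodes, 2, 1))
	fmt.Println(rankByWeights(nodes, 1, 2))
}
```

With a single-criterion sorter the answer to "why did node-a go first?" is one comparison; with weights it depends on every scorer and every weight.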

> Revisit node sorting algorithm for fairness
> -------------------------------------------
>
>                 Key: YUNIKORN-21
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-21
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: core - scheduler
>            Reporter: Wangda Tan
>            Priority: Major
>         Attachments: Improve node sorting algorithm v1.pdf, Improve node 
> sorting algorithm v2.pdf
>
>
> Currently, we're using DominantRatio for the node sorting algorithm
> {code:java}
> func CompUsageShares(left, right *Resource) int {
>  lshares := getShares(left, nil)
>  rshares := getShares(right, nil)
>  return compareShares(lshares, rshares) 
> }{code}
> This is not good, for two reasons:
>  # A dominant resource comparison is about 8X more expensive than a single 
> float comparison for two resource types.
>  # The dominant resource is not stable when we have scarce resource types like 
> GPU. Take a node with 192GB mem, 32 vcores, and 1 GPU available, compared to 
> one with 168GB mem, 64 vcores, and 8 GPUs available: the former can go first 
> because of the following logic:
> {code:java}
> if total == nil || total.Resources[k] == 0 {
>  // negative share is logged
>  if v < 0 {
>   log.Logger().Debug("usage is negative no total, share is also negative",
>    zap.Int64("resource quantity", int64(v)))
>  }
>  shares[idx] = float64(v)
>  idx++
>  continue
> }{code}
> I think we should discard the dominant resource comparison for node 
> resources. Instead, we can just use one resource type (like vcores) to 
> compare available resources.
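
The single-resource comparison proposed in the description could be sketched roughly as follows. This is only an illustration under stated assumptions: NodeInfo, its Available map, the resource keys, and sortByAvailable are hypothetical names, not YuniKorn's actual types. Descending order (most available first) is also an assumption; a bin-packing policy would want the reverse.

```go
package main

import (
	"fmt"
	"sort"
)

// NodeInfo is a hypothetical stand-in for a scheduler node: a name plus the
// available quantity per resource type. Not the real YuniKorn structure.
type NodeInfo struct {
	Name      string
	Available map[string]int64
}

// sortByAvailable orders nodes in place by the available quantity of one
// resource type, largest first. Each comparison is a single int64 compare.
// A node that does not report the resource is treated as having 0 available,
// because a missing map key reads as the zero value.
func sortByAvailable(nodes []NodeInfo, resource string) {
	sort.SliceStable(nodes, func(i, j int) bool {
		return nodes[i].Available[resource] > nodes[j].Available[resource]
	})
}

func main() {
	// The two nodes from the description (memory in GB for readability).
	nodes := []NodeInfo{
		{Name: "node-1", Available: map[string]int64{"memory": 192, "vcore": 32, "gpu": 1}},
		{Name: "node-2", Available: map[string]int64{"memory": 168, "vcore": 64, "gpu": 8}},
	}
	sortByAvailable(nodes, "vcore")
	for _, n := range nodes {
		fmt.Println(n.Name, n.Available["vcore"])
	}
}
```

Sorting on vcores alone puts the 64-vcore node ahead of the 32-vcore node, and the result does not depend on how scarce resources such as GPU happen to be distributed.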



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

