[ https://issues.apache.org/jira/browse/YUNIKORN-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Craig Condit closed YUNIKORN-1347.
----------------------------------
    Resolution: Implemented

> YuniKorn triggers EKS auto-scaling even when pod requests have exceeded the queue limit
> ---------------------------------------------------------------------------------------
>
>                 Key: YUNIKORN-1347
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1347
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler, shim - kubernetes
>            Reporter: Anthony Wu
>            Priority: Major
>
> Hi guys,
> We are trying to use YuniKorn to manage our AWS EKS infrastructure and limit resource usage for different users and groups. We also use the k8s cluster autoscaler ([https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler]) to scale the cluster when necessary.
> *Environment*
> * AWS EKS on k8s 1.21
> * YuniKorn 1.1 running as a k8s scheduler plugin, for best compatibility
> * cluster-autoscaler v1.21.0
> *Issues*:
> Let's say we have a queue with the limit below:
> {code:yaml}
> queues:
>   - name: dev
>     submitacl: "*"
>     resources:
>       max:
>         memory: 100Gi
>         vcore: 10
> {code}
> Then we create 4 pods in the {{dev}} queue, each requiring 5 cores and 50Gi of memory.
> We end up with 2 pods {{Running}} and 2 pods {{Pending}}, because the queue has reached its limit of 100Gi of memory and 10 cpus.
> We would expect the queued pods not to trigger EKS auto-scaling, since they cannot be allocated until other resources in the queue have been released.
> But what we see is that the queued pods still trigger cluster auto-scaling regardless, as shown in the example below:
> {code:java}
> Status:       Pending
> ...
> Conditions:
>   Type           Status
>   PodScheduled   False
> Events:
>   Type     Reason            Age    From                Message
>   ----     ------            ----   ----                -------
>   Warning  FailedScheduling  3m5s   yunikorn            0/147 nodes are available: 147 Pod is not ready for scheduling.
>   Normal   Scheduling        3m3s   yunikorn            yunikorn/dask-user-07ff5f3b-8qjkl8 is queued and waiting for allocation
>   Normal   TriggeredScaleUp  2m53s  cluster-autoscaler  pod triggered scale-up: [{eksctl-cluster-nodegroup-spot-xlarge-compute-1-NodeGroup-8VURTD4WKCYV 0->4 (max: 16)}]
> {code}
> So eventually EKS auto-added some hosts that were never actually used or allocated, because the pods are not yet approved for scheduling.
> We also tried gang scheduling with the pods in a task group, but it has a similar issue: even when the whole gang is not ready to schedule, YuniKorn creates the placeholder pods, which trigger auto-scaling of the EKS cluster.
> *Causes and potential solutions*
> We looked at the source code of both the autoscaler and YuniKorn, and we think the reason is simply that the autoscaler does not know about the YuniKorn-specific events and state (Pending but not quota-approved) of a Pod. It searches for all Pods with `PodScheduled=False` and then checks whether it needs to add resources for them.
> The issue could be resolved from either side:
> - To solve it on the autoscaler side, the autoscaler would need to know about YuniKorn's special events and states.
> - To solve it on the YuniKorn side, we think YuniKorn needs to not create the pod, or at least not leave it in the `Pending` phase, until it is quota-approved.
> ** Not sure how hard this is to achieve, but as long as a pod is created and goes to Pending, the autoscaler will try to pick it up.
> We think solving it on the YuniKorn side would be cleaner, since the autoscaler should not need to know about the k8s scheduler implementation in order to make a decision. Also, other autoscaler alternatives like AWS Karpenter could suffer from the same issue when interacting with YuniKorn.
> Wondering whether this issue report makes sense to you guys.
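> For reference, a spec for one of the 4 pods in the reproduction would look roughly like the sketch below. The pod name, image, and {{applicationId}} value are illustrative placeholders, and the {{queue}} label and {{schedulerName}} field follow YuniKorn's documented pod conventions ({{schedulerName}} is only needed in the standard deployment mode, not when YuniKorn runs as a scheduler plugin):
> {code:yaml}
> apiVersion: v1
> kind: Pod
> metadata:
>   name: dask-user-example            # illustrative name
>   labels:
>     applicationId: dask-user-example # illustrative application ID
>     queue: root.dev                  # submit to the dev queue
> spec:
>   schedulerName: yunikorn            # standard (non-plugin) mode only
>   containers:
>     - name: worker
>       image: example/worker:latest   # placeholder image
>       resources:
>         requests:
>           cpu: "5"                   # 4 such pods x 5 vcore = 20 > queue max of 10
>           memory: 50Gi               # 4 x 50Gi = 200Gi > queue max of 100Gi
> {code}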
> Let us know if there are any other solutions and whether this can be solved in the future :)
> Thanks a lot!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: issues-h...@yunikorn.apache.org