[ 
https://issues.apache.org/jira/browse/YUNIKORN-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2866:
-----------------------------------
    Target Version: 1.7.0  (was: 1.6.0, 1.7.0)

> [UMBRELLA] Support InPlacePodVerticalScaling (phase 2)
> ------------------------------------------------------
>
>                 Key: YUNIKORN-2866
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2866
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: core - scheduler, shim - kubernetes
>            Reporter: Craig Condit
>            Assignee: Craig Condit
>            Priority: Major
>
> Kubernetes 1.27 added a new [InPlacePodVerticalScaling|http://example.com/] 
> feature. While this is currently still in an alpha state as of 1.30 (and 
> therefore requires a feature flag to enable), it will likely be moved to beta 
> in 1.32, meaning it will be enabled by default, and considered stable in an 
> upcoming release. This feature has implications for YuniKorn: once it is 
> enabled, the requests and limits of a Pod are no longer immutable.
> Fortunately, the updated API objects that enable the feature contain the new 
> fields so we can add initial support for the feature now. To enable the 
> feature for testing in a Kind cluster, the kind cluster configuration needs 
> to contain the following:
> {noformat}
> kind: Cluster
> apiVersion: kind.x-k8s.io/v1alpha4
> featureGates:
>   "InPlacePodVerticalScaling": true
> {noformat}
> During scheduling of new pods, the requested resources are still used as 
> before.
> However, once a pod has been started, the actual resource utilization needs 
> to be tracked via a new {{Pod.Status.ContainerStatuses[].AllocatedResources}} 
> collection. In addition, if the value of {{Pod.Status.Resize}} is set to 
> {{{}Proposed{}}}, the usage of each container needs to be computed as the 
> maximum of its requested and allocated resources. The requested resources 
> field becomes mutable once this feature is turned on, and it represents the 
> latest *requested* (not actual) usage of the container.
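
The rule above can be sketched in Go as follows. This is a minimal illustration, not the shim's code: the Resources, Container, and Pod types are simplified, hypothetical stand-ins for the Kubernetes API objects, quantities are plain integers, and using the allocated value alone when no resize is proposed is one reading of the description.

```go
package main

// Resources maps a resource name (for example "cpu" in millicores or
// "memory" in bytes) to a quantity. Hypothetical simplification.
type Resources map[string]int64

// Container is a hypothetical stand-in for a container's spec and status.
type Container struct {
	Requested Resources // latest requested resources (mutable with the feature on)
	Allocated Resources // mirrors AllocatedResources; empty until the pod is running
}

// Pod is a hypothetical stand-in for the Kubernetes Pod object.
type Pod struct {
	Resize     string // mirrors Pod.Status.Resize, e.g. "Proposed"
	Containers []Container
}

// maxResources returns the per-resource maximum of a and b.
func maxResources(a, b Resources) Resources {
	out := Resources{}
	for k, v := range a {
		out[k] = v
	}
	for k, v := range b {
		if v > out[k] {
			out[k] = v
		}
	}
	return out
}

// effectiveUsage sums the usage of all containers in the pod:
//   - no allocated resources yet (feature off or pod not running): requested
//   - resize proposed: max(requested, allocated) per resource
//   - otherwise: allocated (assumption, see the note above)
func effectiveUsage(pod Pod) Resources {
	total := Resources{}
	for _, c := range pod.Containers {
		var usage Resources
		switch {
		case len(c.Allocated) == 0:
			usage = c.Requested
		case pod.Resize == "Proposed":
			usage = maxResources(c.Requested, c.Allocated)
		default:
			usage = c.Allocated
		}
		for k, v := range usage {
			total[k] += v
		}
	}
	return total
}
```

Because the fallback to requested resources applies whenever the allocated set is empty, the same function works unchanged on clusters where the feature is disabled.
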
> Supporting this feature is not optional within YuniKorn, as failure to 
> process the updated resources will mean that we do not account for resource 
> usage correctly if a pod is updated.
> Several steps will need to be taken to support this properly:
>  * Ensure that GetPodResources() accurately computes the effective usage of 
> the Pod in all cases. Since the AllocatedResources field will not be 
> populated when this feature is not active, and is only set once the pod is in 
> a running state, this is fairly straightforward and can be implemented even 
> in clusters which do not have this feature enabled.
>  * The result of GetPodResources() will need to be cached in the shim so that 
> we can detect resource changes on Pod updates. Comparing the result of 
> GetPodResources() on the new Pod vs. the existing version will allow us to 
> easily detect changes.
>  * If changes are detected to a running YuniKorn-managed pod, an update 
> message will need to be sent to the core to change the resources of the 
> allocated task.
>  * If changes are detected to a running non-YuniKorn-managed pod, an update 
> of the node utilized resources will need to be sent from the shim to the core.
>  * The core *must not* reject these updates, even if they would cause a queue 
> to go over capacity. Instead, they must be applied to the appropriate ask or 
> allocation unconditionally.
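
The caching and change-detection steps listed above might look roughly like this in Go. It is a sketch under stated assumptions: usageCache, observe, and the UpdateKind values are hypothetical names, not YuniKorn APIs, and the usage passed in is assumed to be the result of GetPodResources() for the updated pod. The core side (applying updates unconditionally) is not modelled.

```go
package main

import "sync"

// Resources maps a resource name to a quantity. Hypothetical simplification.
type Resources map[string]int64

// sameResources reports whether a and b hold the same quantities,
// treating a missing resource as zero.
func sameResources(a, b Resources) bool {
	for k, v := range a {
		if b[k] != v {
			return false
		}
	}
	for k, v := range b {
		if a[k] != v {
			return false
		}
	}
	return true
}

// UpdateKind names the message the shim would send for a pod update.
type UpdateKind int

const (
	NoChange         UpdateKind = iota
	AllocationUpdate            // running YuniKorn-managed pod changed
	NodeUsageUpdate             // running non-YuniKorn-managed pod changed
)

// usageCache remembers the last computed usage per pod UID.
type usageCache struct {
	mu    sync.Mutex
	byPod map[string]Resources
}

func newUsageCache() *usageCache {
	return &usageCache{byPod: map[string]Resources{}}
}

// observe stores the latest usage for the pod and reports which update,
// if any, should follow. The first sighting of a pod is never a change.
func (c *usageCache) observe(podUID string, managed, running bool, usage Resources) UpdateKind {
	c.mu.Lock()
	defer c.mu.Unlock()

	prev, seen := c.byPod[podUID]
	stored := make(Resources, len(usage))
	for k, v := range usage {
		stored[k] = v // copy so later caller mutations do not alter the cache
	}
	c.byPod[podUID] = stored

	if !seen || !running || sameResources(prev, usage) {
		return NoChange
	}
	if managed {
		return AllocationUpdate
	}
	return NodeUsageUpdate
}
```

Comparing the cached value with the freshly computed one keeps the detection independent of which Pod field actually changed, which is the point of routing everything through a single usage computation.
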



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
