[jira] [Reopened] (FLINK-31502) Limit the number of concurrent scale operations to reduce cluster churn

Maximilian Michels (Jira) Thu, 07 Dec 2023 06:30:04 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-31502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Maximilian Michels reopened FLINK-31502:
----------------------------------------

Reopening because this is an actual issue.

> Limit the number of concurrent scale operations to reduce cluster churn
> -----------------------------------------------------------------------
>
>                 Key: FLINK-31502
>                 URL: https://issues.apache.org/jira/browse/FLINK-31502
>             Project: Flink
>          Issue Type: Improvement
>          Components: Autoscaler, Kubernetes Operator
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: kubernetes-operator-1.5.0
>
>
> Until we move to the upcoming Rescale API, which recycles pods, we need to 
> be mindful of how many deployments we scale at the same time, because each 
> of them gives up all of its pods and then requests the new number of 
> required pods. 
> This can cause churn in the cluster and temporarily lead to "unallocatable" 
> pods, which triggers the k8s cluster autoscaler to add more cluster nodes. 
> That is often not desirable because the resources actually required after 
> the scaling has settled are lower.
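
A minimal sketch of the idea described above, assuming a simple in-memory 
cap on in-flight scale operations. This is not the operator's actual 
implementation; the class name ScaleOperationLimiter, its methods, and the 
deployment names are hypothetical and chosen only for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical helper that caps how many deployments may be scaled at once.
 * Not part of Flink or the Kubernetes Operator; an illustrative sketch only.
 */
public class ScaleOperationLimiter {
    private final int maxConcurrent;
    // Names of deployments whose scale operation is currently in flight.
    private final Set<String> inFlight = new HashSet<>();

    public ScaleOperationLimiter(int maxConcurrent) {
        if (maxConcurrent < 1) {
            throw new IllegalArgumentException("maxConcurrent must be >= 1");
        }
        this.maxConcurrent = maxConcurrent;
    }

    /** Returns true if the deployment may scale now, false if it should wait. */
    public synchronized boolean tryBeginScale(String deployment) {
        if (inFlight.contains(deployment)) {
            return true; // already counted against the limit
        }
        if (inFlight.size() >= maxConcurrent) {
            return false; // limit reached, defer this scale operation
        }
        inFlight.add(deployment);
        return true;
    }

    /** Frees the slot once the deployment has settled after scaling. */
    public synchronized void endScale(String deployment) {
        inFlight.remove(deployment);
    }

    public synchronized int inFlightCount() {
        return inFlight.size();
    }
}
```

With a limit of 2, a third deployment is deferred until one of the first two 
has finished scaling, so fewer deployments give up and re-request their pods 
at the same moment.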



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
