[ 
https://issues.apache.org/jira/browse/FLINK-33764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-33764.
------------------------------
    Resolution: Fixed

merged to main f6adb400e1c87f06faec948379c264eebba71166

> Incorporate GC / Heap metrics in autoscaler decisions
> -----------------------------------------------------
>
>                 Key: FLINK-33764
>                 URL: https://issues.apache.org/jira/browse/FLINK-33764
>             Project: Flink
>          Issue Type: New Feature
>          Components: Autoscaler, Kubernetes Operator
>            Reporter: Gyula Fora
>            Assignee: Gyula Fora
>            Priority: Major
>              Labels: pull-request-available
>
> The autoscaler currently does not use any GC or heap metrics in its 
> scaling decisions. 
> While the long-term goal may be to support vertical scaling (increasing TM 
> sizes), this is currently out of scope for the autoscaler.
> However, it is very important to detect cases where the throughput of certain 
> vertices or the entire pipeline is critically affected by long GC pauses. In 
> these cases the current autoscaler logic would wrongly assume a low true 
> processing rate and scale the pipeline up too far, ramping up costs and 
> causing further issues.
> Using the improved GC metrics introduced in 
> https://issues.apache.org/jira/browse/FLINK-33318 we should measure the GC 
> pauses and simply block scaling decisions if the pipeline spends too much 
> time garbage collecting, and notify the user that they need to increase 
> memory.
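
The check described above could look roughly like the following. This is a minimal sketch only, not the code merged for this issue: the class name GcPressureGuard, its methods, and the 0.3 threshold used in the usage note are all hypothetical, and it assumes the caller has already aggregated total GC time over an observation window from the reported metrics.

```java
import java.time.Duration;

/**
 * Hypothetical guard that blocks scaling when too much time is spent in GC.
 * Names and threshold semantics are assumptions made for illustration.
 */
final class GcPressureGuard {

    // Maximum tolerated fraction of wall-clock time spent in GC, in (0, 1].
    private final double maxGcTimeFraction;

    GcPressureGuard(double maxGcTimeFraction) {
        if (maxGcTimeFraction <= 0.0 || maxGcTimeFraction > 1.0) {
            throw new IllegalArgumentException("threshold must be in (0, 1]");
        }
        this.maxGcTimeFraction = maxGcTimeFraction;
    }

    /** Fraction of the observation window that was spent garbage collecting. */
    static double gcTimeFraction(Duration gcTime, Duration window) {
        if (window.isZero() || window.isNegative()) {
            throw new IllegalArgumentException("window must be positive");
        }
        return (double) gcTime.toMillis() / window.toMillis();
    }

    /** Returns true if scaling should be blocked because of GC pressure. */
    boolean shouldBlockScaling(Duration gcTime, Duration window) {
        return gcTimeFraction(gcTime, window) > maxGcTimeFraction;
    }
}
```

With a threshold of 0.3, a window of 60 seconds containing 24 seconds of GC gives a fraction of 0.4, so scaling would be blocked and the user told to increase memory; 6 seconds of GC in the same window (0.1) would let scaling proceed.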



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
