[jira] [Commented] (FLINK-33940) Update the auto-derivation rule of max parallelism for enlarged upscaling space

Maximilian Michels (Jira) Tue, 02 Jan 2024 08:24:30 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-33940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17801843#comment-17801843
 ]


Maximilian Michels commented on FLINK-33940:
--------------------------------------------

[~Zhanghao Chen] Even though the factor only affects operators with a 
parallelism above 840, I wonder whether we need to leave more room for 
scale-up. But I don't have a strong opinion.

{quote}
IIUC, when the parallelism of a job is very small (1 or 2) and the max 
parallelism is 1024, one subtask will have 1024 key groups. On the state 
backend side, too many key groups may affect performance. (This is my concern 
about changing the default in the Flink community.)
{quote}

[~fanrui] I think we need to find out how big the performance impact actually 
is when jumping from 128 to 840 key groups. But 128 may just have been a very 
conservative number.
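
To put rough numbers on the key-group concern quoted above, here is a minimal 
Python sketch. It assumes the key groups are spread as evenly as possible over 
the subtasks; {{key_groups_per_subtask}} is a hypothetical helper written for 
this illustration, not a Flink API.

```python
from typing import Tuple


def key_groups_per_subtask(max_parallelism: int, parallelism: int) -> Tuple[int, int]:
    """Return the (smallest, largest) number of key groups held by one subtask.

    Assumption: the max_parallelism key groups are spread as evenly as
    possible over the subtasks. Hypothetical helper, not a Flink API.
    """
    if not 1 <= parallelism <= max_parallelism:
        raise ValueError("need 1 <= parallelism <= max_parallelism")
    base, remainder = divmod(max_parallelism, parallelism)
    # Some subtasks get one extra key group when the split is not exact.
    return base, base + (1 if remainder else 0)


if __name__ == "__main__":
    for max_parallelism in (128, 840, 1024):
        for parallelism in (1, 2):
            low, high = key_groups_per_subtask(max_parallelism, parallelism)
            print(f"max parallelism {max_parallelism}, parallelism {parallelism}: "
                  f"{low}-{high} key groups per subtask")
```

Under that assumption, a job at parallelism 1 holds all 1024 key groups in one 
subtask and a job at parallelism 2 holds 512 per subtask, against 128 and 64 
with the current lower bound of 128. The sketch only counts key groups; what 
that costs in the state backend still has to be measured.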

> Update the auto-derivation rule of max parallelism for enlarged upscaling 
> space
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-33940
>                 URL: https://issues.apache.org/jira/browse/FLINK-33940
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Core
>            Reporter: Zhanghao Chen
>            Assignee: Zhanghao Chen
>            Priority: Major
>
> *Background*
> The choice of the max parallelism of a stateful operator is important as it 
> limits the upper bound of the parallelism of the operator while it can also 
> add extra overhead when being set too large. Currently, the max parallelism 
> of an operator is either fixed to a value specified by API core / pipeline 
> option or auto-derived with the following rule:
> {{min(max(roundUpToPowerOfTwo(operatorParallelism * 1.5), 128), 32767)}}
> *Problem*
> Recently, the elasticity of Flink jobs has become more and more valued by 
> users. The current auto-derived max parallelism was introduced a long time 
> ago and only allows the operator parallelism to be roughly doubled, which is 
> not desirable for elasticity. Setting a max parallelism manually may not be 
> desirable either: users may not have sufficient expertise to select a good 
> max-parallelism value.
> *Proposal*
> Update the auto-derivation rule of max parallelism to derive a larger max 
> parallelism for a better elasticity experience out of the box. A candidate is 
> as follows:
> {{min(max(roundUpToPowerOfTwo(operatorParallelism * {*}5{*}), {*}1024{*}), 
> 32767)}}
> Looking forward to your opinions on this.
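
For a side-by-side view of the two rules in the issue description above, the 
following Python sketch evaluates both formulas for a few parallelism values. 
The function names are hypothetical and this is not Flink's implementation. 
One assumption is labeled in the code: a fractional product is truncated 
before rounding up to a power of two, since the issue text does not say how it 
is rounded.

```python
def round_up_to_power_of_two(x: int) -> int:
    """Smallest power of two that is >= x (returns 1 for x <= 1)."""
    return 1 if x <= 1 else 1 << (x - 1).bit_length()


def derive_max_parallelism(parallelism: int, factor: float, lower_bound: int,
                           upper_bound: int = 32767) -> int:
    """min(max(roundUpToPowerOfTwo(parallelism * factor), lower_bound), upper_bound).

    Assumption: a fractional product is truncated before rounding up; the
    issue text does not specify this.
    """
    scaled = int(parallelism * factor)
    return min(max(round_up_to_power_of_two(scaled), lower_bound), upper_bound)


def current_rule(parallelism: int) -> int:
    # Factor 1.5 and lower bound 128, as in the Background section.
    return derive_max_parallelism(parallelism, factor=1.5, lower_bound=128)


def proposed_rule(parallelism: int) -> int:
    # Factor 5 and lower bound 1024, as in the Proposal section.
    return derive_max_parallelism(parallelism, factor=5, lower_bound=1024)


if __name__ == "__main__":
    for p in (1, 100, 400, 1000, 10000):
        print(f"parallelism {p}: current {current_rule(p)}, "
              f"proposed {proposed_rule(p)}")
```

With these inputs, an operator at parallelism 100 gets a max parallelism of 
256 under the current rule and 1024 under the candidate; at parallelism 1000 
the values are 2048 and 8192.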



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
