[ 
https://issues.apache.org/jira/browse/FLINK-34655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826015#comment-17826015
 ] 

Maximilian Michels commented on FLINK-34655:
--------------------------------------------

Thanks for raising awareness of Flink version compatibility, [~fanrui]! 
Although we've been using Flink Autoscaling with 1.16, it is true that only 
Flink 1.17 and later support it out of the box.
{quote}In the short term, we only use the autoscaler to give suggestion instead 
of scaling directly. After our users think the parallelism calculation is 
reliable, they will have stronger motivation to upgrade the flink version.
{quote}
I understand the idea behind providing suggestions. However, it is difficult to 
assess the quality of Autoscaling decisions without applying them 
automatically. The reason is that suggestions become stale very quickly if the 
load pattern is not completely static. Even for static load patterns, if the 
user doesn't redeploy within minutes, the suggestion may already be stale 
because the number of pending records has grown too much in the meantime. In 
any case, production load patterns are rarely static, which means that 
autoscaling will inevitably trigger multiple times a day, but that is where 
its real power is unleashed. It would be great to hear about any concerns your 
users have about turning on automatic scaling. We've been operating it in 
production for about a year now.

Back to the issue here, should we think about a patch release for 1.15 / 1.16 
to add support for overriding vertex parallelism?

> Autoscaler doesn't work for flink 1.15
> --------------------------------------
>
>                 Key: FLINK-34655
>                 URL: https://issues.apache.org/jira/browse/FLINK-34655
>             Project: Flink
>          Issue Type: Bug
>          Components: Autoscaler
>            Reporter: Rui Fan
>            Assignee: Rui Fan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: kubernetes-operator-1.8.0
>
>
> flink-kubernetes-operator is committed to supporting the latest 4 Flink minor 
> versions, and the autoscaler is a part of flink-kubernetes-operator. Currently, 
> the latest 4 Flink minor versions are 1.15, 1.16, 1.17 and 1.18.
> But the autoscaler doesn't work for Flink 1.15.
> h2. Root cause: 
> * FLINK-28310 added some properties in IOMetricsInfo in flink-1.16
> * IOMetricsInfo is a part of JobDetailsInfo
> * JobDetailsInfo is necessary for autoscaler [1]
> * Flink's RestClient doesn't allow any property to be missing when 
> deserializing the JSON
> That means that a RestClient newer than 1.15 cannot fetch JobDetailsInfo for 
> 1.15 jobs.
> h2. How to fix it properly?
> - [FLINK-34655] Copy IOMetricsInfo to the flink-autoscaler-standalone module
> - Remove the copied code once Flink 1.15 is no longer supported
> [1] 
> https://github.com/apache/flink-kubernetes-operator/blob/ede1a610b3375d31a2e82287eec67ace70c4c8df/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/ScalingMetricCollector.java#L109
> [2] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-401%3A+REST+API+JSON+response+deserialization+unknown+field+tolerance
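
The root cause listed in the quoted issue can be illustrated with a small, 
dependency-free sketch. It does not use Flink's RestClient or Jackson; the 
class, the method names and the property name "field-added-in-1.16" are all 
hypothetical and only model the behaviour described above: a strict 
deserializer rejects a response from an older cluster because a newer property 
is absent, while a tolerant one substitutes a default. The tolerant variant is 
just one possible approach and is not necessarily how the actual fix works.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StrictDeserializationSketch {

    // Hypothetical property names; the real IOMetricsInfo fields may differ.
    static final List<String> KNOWN_TO_NEW_CLIENT =
            List.of("read-bytes", "write-bytes", "field-added-in-1.16");

    /** Strict mode: every property the client knows must be present. */
    static Map<String, Long> deserializeStrict(Map<String, Long> payload) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String field : KNOWN_TO_NEW_CLIENT) {
            Long value = payload.get(field);
            if (value == null) {
                throw new IllegalArgumentException(
                        "Missing required property: " + field);
            }
            result.put(field, value);
        }
        return result;
    }

    /** Tolerant mode: a missing property falls back to a default value. */
    static Map<String, Long> deserializeTolerant(
            Map<String, Long> payload, long defaultValue) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String field : KNOWN_TO_NEW_CLIENT) {
            result.put(field, payload.getOrDefault(field, defaultValue));
        }
        return result;
    }

    public static void main(String[] args) {
        // Models a response from an older cluster: the newer property is absent.
        Map<String, Long> oldResponse =
                Map.of("read-bytes", 10L, "write-bytes", 20L);

        try {
            deserializeStrict(oldResponse);
        } catch (IllegalArgumentException e) {
            System.out.println("strict: " + e.getMessage());
        }
        System.out.println("tolerant: " + deserializeTolerant(oldResponse, -1L));
    }
}
```

In this model the strict path fails on the older response and the tolerant 
path fills in -1 for the absent property, which mirrors why a newer client 
cannot read JobDetailsInfo from a 1.15 job unless the model class is relaxed.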



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
