Rui Fan created FLINK-34655:
-------------------------------

             Summary: Autoscaler doesn't work for flink 1.15
                 Key: FLINK-34655
                 URL: https://issues.apache.org/jira/browse/FLINK-34655
             Project: Flink
          Issue Type: Bug
          Components: Autoscaler
            Reporter: Rui Fan
            Assignee: Rui Fan
             Fix For: 1.8.0

flink-kubernetes-operator is committed to supporting the latest 4 Flink minor versions, and the autoscaler is part of flink-kubernetes-operator. Currently, the latest 4 Flink minor versions are 1.15, 1.16, 1.17 and 1.18, but the autoscaler doesn't work for Flink 1.15.

h2. Root cause:

* FLINK-28310 added some properties to IOMetricsInfo in flink-1.16
* IOMetricsInfo is a part of JobDetailsInfo
* JobDetailsInfo is necessary for the autoscaler [1]
* Flink's RestClient doesn't allow any property to be missing when deserializing the JSON

That means that a RestClient newer than 1.15 cannot fetch JobDetailsInfo for 1.15 jobs.

h2. How to fix it properly?

The Flink side should support ignoring unknown properties, and FLINK-33268 already does that. But when I tried running the autoscaler with a flink-1.15 job, it still didn't work, because the properties added to IOMetricsInfo are primitive types. DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES should be disabled as well. (Not sure whether it should be a separate FLIP or whether it can be part of FLIP-401 [2].)

h2. How to fix it in the short term?

1. Copy the latest RestMapperUtils and RestClient from the master branch (which includes FLINK-33268) to the flink-autoscaler module. (The copied classes will be loaded first.)
2. Disable DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES in RestMapperUtils#flexibleObjectMapper in the copied class.

Based on these 2 steps, flink-1.15 works well with the autoscaler (I tried it locally).

Once DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES is disabled in RestMapperUtils#flexibleObjectMapper on the Flink side and the corresponding code is released, flink-kubernetes-operator can remove these 2 copied classes.

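To make the root cause concrete, here is a minimal, self-contained sketch of why a missing primitive property breaks a strict decoder while a lenient one falls back to a default. It is a toy model only: it does not use Jackson, RestMapperUtils or RestClient, and the class name, field names and the failOnNullForPrimitives flag are all hypothetical stand-ins chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of strict vs. lenient decoding; NOT Flink's or Jackson's real code. */
final class IoMetricsDecodeSketch {

    /** Hypothetical stand-in for a response class whose newer fields are primitives. */
    static final class IoMetrics {
        final long bytesRead;         // assumed field that every server version sends
        final long accumulatedBusyMs; // assumed field that only newer servers send

        IoMetrics(long bytesRead, long accumulatedBusyMs) {
            this.bytesRead = bytesRead;
            this.accumulatedBusyMs = accumulatedBusyMs;
        }
    }

    /**
     * Decodes a payload that has already been parsed into a map.
     * The flag mimics the idea behind FAIL_ON_NULL_FOR_PRIMITIVES:
     * when true, an absent primitive property is an error.
     */
    static IoMetrics decode(Map<String, Object> payload, boolean failOnNullForPrimitives) {
        long bytesRead = readLong(payload, "bytesRead", failOnNullForPrimitives);
        long busyMs = readLong(payload, "accumulatedBusyMs", failOnNullForPrimitives);
        return new IoMetrics(bytesRead, busyMs);
    }

    private static long readLong(Map<String, Object> payload, String key, boolean failOnNull) {
        Object value = payload.get(key);
        if (value == null) {
            if (failOnNull) {
                throw new IllegalStateException("Missing primitive property: " + key);
            }
            return 0L; // lenient mode: fall back to the primitive's default value
        }
        return ((Number) value).longValue();
    }

    public static void main(String[] args) {
        // Payload shaped like an older server's response: the newer field is absent.
        Map<String, Object> oldPayload = new HashMap<>();
        oldPayload.put("bytesRead", 42L);

        IoMetrics lenient = decode(oldPayload, false);
        System.out.println("lenient busy time: " + lenient.accumulatedBusyMs);

        try {
            decode(oldPayload, true);
        } catch (IllegalStateException e) {
            System.out.println("strict decode failed: " + e.getMessage());
        }
    }
}
```

In this model, ignoring unknown properties alone would not help, because the old payload has no extra properties; it is missing one that the newer class expects, which is the case the second short-term step is meant to cover.
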
[1] https://github.com/apache/flink-kubernetes-operator/blob/ede1a610b3375d31a2e82287eec67ace70c4c8df/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/ScalingMetricCollector.java#L109
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-401%3A+REST+API+JSON+response+deserialization+unknown+field+tolerance

--
This message was sent by Atlassian Jira
(v8.20.10#820010)