Zhanghao Chen created FLINK-31245:
-------------------------------------

             Summary: Adaptive scheduler does not reset the state of GlobalAggregateManager when rescaling
                 Key: FLINK-31245
                 URL: https://issues.apache.org/jira/browse/FLINK-31245
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.16.1
            Reporter: Zhanghao Chen
             Fix For: 1.18.0

*Problem*

GlobalAggregateManager is used to share state among the parallel tasks of a job and thereby coordinate their execution. It keeps this state (the _accumulators_ field in JobMaster) in JM memory. The content of the accumulator state is defined by user code. At my company, a user stores the task parallelism in the accumulator, assuming that task parallelism never changes. This assumption breaks under the adaptive scheduler: when the job is rescaled, the parallelism changes but the accumulator state is kept, so tasks read a stale value.

*Possible Solutions*

# Mark GlobalAggregateManager as deprecated. It seems that the operator coordinator can completely replace GlobalAggregateManager and is a more elegant solution. In that case, it is fine to deprecate GlobalAggregateManager and leave this issue unfixed; we can open another ticket for the deprecation.
# If we decide to continue supporting GlobalAggregateManager, then we need to reset its state when rescaling.
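
To make the failure mode concrete, here is a minimal, self-contained sketch in plain Java. It does not use Flink's real classes: {{AggregateStateSketch}}, {{reportParallelism}} and {{resetOnRescale}} are hypothetical names that only model a JM-side accumulator map surviving a rescale, and what a reset as in option 2 would change.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for the JM-side accumulator state.
 * This is NOT Flink's GlobalAggregateManager, only a simplified model of it.
 */
class AggregateStateSketch {
    // Models the accumulators kept in JM memory, keyed by aggregate name.
    private final Map<String, Integer> accumulators = new HashMap<>();

    /**
     * Models the user logic from the report: the first parallelism reported
     * under a name is stored and returned to every later caller.
     */
    int reportParallelism(String name, int parallelism) {
        return accumulators.computeIfAbsent(name, key -> parallelism);
    }

    /** Models option 2: drop all aggregate state when the job is rescaled. */
    void resetOnRescale() {
        accumulators.clear();
    }

    public static void main(String[] args) {
        AggregateStateSketch jm = new AggregateStateSketch();

        // Initial deployment with parallelism 4.
        System.out.println(jm.reportParallelism("parallelism", 4));

        // Rescale to parallelism 2 without a reset: the stored value is stale.
        System.out.println(jm.reportParallelism("parallelism", 2));

        // Rescale to parallelism 2 with a reset: tasks see the new value.
        jm.resetOnRescale();
        System.out.println(jm.reportParallelism("parallelism", 2));
    }
}
```

In this model the second call still returns the old parallelism because nothing clears the map between deployments, which is the behaviour described above. Where such a reset would be hooked into the adaptive scheduler is left open here.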