ZhangWei created FLINK-22049:
--------------------------------

             Summary: Simplify calculating the starting index of the local 
key-group range in InternalTimerServiceImpl constructor.
                 Key: FLINK-22049
                 URL: https://issues.apache.org/jira/browse/FLINK-22049
             Project: Flink
          Issue Type: Improvement
          Components: API / DataStream
            Reporter: ZhangWei


For org.apache.flink.streaming.api.operators.InternalTimerServiceImpl;

in current implementation, when we try to find the starting index of the local 
key-group range, we iterate over the KeyGroupRange and try to find the min 
value.

while this is unnecessary and wasteful, because the KeyGroupRange is often 
monotonically increasing and we can just get the startKeyGroup which is what we 
want.

It is even worse when we set a very large max parallelism(eg: 10k level) and a 
small parallelism(eg: 10 level), thus a KeyGroupRange may contain about 1k 
index in it. And then we iterate this 1k increasing numbers to find the min 
value.

so if the KeyGroupRange is not an EMPTY_KEY_GROUP_RANGE(i.e. 
KeyGroupRange#getNumberOfKeyGroups() > 0), we can just get the startKeyGroup.

if KeyGroupRange is just an EMPTY_KEY_GROUP_RANGE, use the Integer.MAX_VALUE.

I think this can avoid much unnecessary operation and save time.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to