liang jie created FLINK-32887:
---------------------------------

             Summary: SourceCoordinatorContext#workerExecutor may cause task initializing slowly
                 Key: FLINK-32887
                 URL: https://issues.apache.org/jira/browse/FLINK-32887
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Common, Runtime / Coordination
    Affects Versions: 1.15.2
            Reporter: liang jie

SourceCoordinatorContext#workerExecutor is typically used to calculate the partitions of a source task. It is implemented as a ScheduledExecutorService with a hard-coded core pool size of 1. Tasks that calculate partitions are executed through 'workerExecutor.scheduleAtFixedRate'.

In some cases the 'getSubscribedPartitions' method takes quite a long time (e.g. 5 min), for example because many topics are included in the same task or because requests to external systems time out, while partitionDiscoveryInterval is set to a short interval (e.g. 1 min). In this case, while the first 'getSubscribedPartitions' task is still running, further 'getSubscribedPartitions' runs are triggered repeatedly and pile up in the queue of workerExecutor, so the 'checkPartitionChanges' tasks are queued as well. Each 'checkPartitionChanges' task then has to wait 25 min (5 * the execution duration of a 'getSubscribedPartitions' task) before it is executed.

In my view, the tasks of workerExecutor should be scheduled with a fixed delay rather than at a fixed rate, because there is no point in executing 'getSubscribedPartitions' repeatedly without a 'checkPartitionChanges' execution in between.
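To illustrate the difference between the two scheduling modes discussed above, here is a minimal standalone sketch. It is not Flink code. The class name FixedRateVsFixedDelay, the method minIdleGapMillis, and the scaled-down timings (150 ms per run instead of 5 min, 50 ms interval instead of 1 min) are assumptions made up for this example. It runs a slow periodic task on a single-threaded ScheduledExecutorService and measures the smallest idle gap between the end of one run and the start of the next.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only mimics a slow periodic discovery task.
class FixedRateVsFixedDelay {

    /**
     * Runs a task that sleeps for taskMillis on a single-threaded scheduler and
     * returns the smallest gap (ms) between the end of one run and the start of
     * the next. Expects runs >= 2.
     */
    static long minIdleGapMillis(boolean fixedRate, long taskMillis, long intervalMillis, int runs) {
        ScheduledExecutorService worker = Executors.newScheduledThreadPool(1);
        List<long[]> spans = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(runs);

        // Stand-in for a slow 'getSubscribedPartitions' call.
        Runnable slowDiscovery = () -> {
            long start = System.nanoTime();
            try {
                Thread.sleep(taskMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // interrupted by shutdownNow(): do not record this run
            }
            spans.add(new long[] {start, System.nanoTime()});
            done.countDown();
        };

        try {
            if (fixedRate) {
                worker.scheduleAtFixedRate(slowDiscovery, 0, intervalMillis, TimeUnit.MILLISECONDS);
            } else {
                worker.scheduleWithFixedDelay(slowDiscovery, 0, intervalMillis, TimeUnit.MILLISECONDS);
            }
            done.await();                                   // wait for the requested number of runs
            worker.shutdownNow();                           // stop further runs
            worker.awaitTermination(5, TimeUnit.SECONDS);   // let an in-flight run exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        } finally {
            worker.shutdownNow();
        }

        long minGapNanos = Long.MAX_VALUE;
        synchronized (spans) {
            for (int i = 1; i < spans.size(); i++) {
                // start of run i minus end of run i-1
                minGapNanos = Math.min(minGapNanos, spans.get(i)[0] - spans.get(i - 1)[1]);
            }
        }
        return TimeUnit.NANOSECONDS.toMillis(minGapNanos);
    }

    public static void main(String[] args) {
        System.out.println("fixed rate  min idle gap (ms): " + minIdleGapMillis(true, 150, 50, 3));
        System.out.println("fixed delay min idle gap (ms): " + minIdleGapMillis(false, 150, 50, 3));
    }
}
```

With scheduleAtFixedRate, a run that overruns its period is followed almost immediately by the next one, so the single worker thread gets no idle time in which other queued work could run. With scheduleWithFixedDelay, the interval is counted from the end of the previous run, so the worker is free for at least that long between runs. Whether this matches the exact queuing behaviour inside SourceCoordinatorContext is not verified here.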