liang jie created FLINK-32887:
---------------------------------
Summary: SourceCoordinatorContext#workerExecutor may cause task
initializing slowly
Key: FLINK-32887
URL: https://issues.apache.org/jira/browse/FLINK-32887
Project: Flink
Issue Type: Improvement
Components: Connectors / Common, Runtime / Coordination
Affects Versions: 1.15.2
Reporter: liang jie
SourceCoordinatorContext#workerExecutor is typically used to calculate the
partitions of a source task. It is implemented as a ScheduledExecutorService
with only 1 core thread (hard-coded). Tasks that calculate partitions are
executed through 'workerExecutor.scheduleAtFixedRate'.
---
In some cases the 'getSubscribedPartitions' method takes quite a long time
(e.g. 5 min), for example because many topics are included in the same task or
because requests to external systems time out. Suppose that
partitionDiscoveryInterval is also set to a short interval, e.g. 1 min.
In this case, 'getSubscribedPartitions' runnable tasks are triggered
repeatedly and pile up in the queue of workerExecutor while the first
'getSubscribedPartitions' task is still running, which causes
'checkPartitionChanges' tasks to be queued as well. Each 'checkPartitionChanges'
task then has to wait about 25 min (5 * the 'getSubscribedPartitions' task
execution duration) before it is executed.
---
In my view, tasks of workerExecutor should be scheduled with a fixed delay
rather than at a fixed rate, because there is no point in executing
'getSubscribedPartitions' tasks repeatedly without a 'checkPartitionChanges'
execution in between.
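
To illustrate the difference between the two scheduling modes, here is a
minimal standalone sketch using only java.util.concurrent. It is not Flink
code: the class SchedulingGaps, the method measureGapsMillis and the timings
are hypothetical and chosen only for the demonstration. A slow task (standing
in for 'getSubscribedPartitions') runs on a single-threaded scheduler, and the
sketch records the idle gap between the end of one run and the start of the
next.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Flink.
class SchedulingGaps {

    /**
     * Runs a slow periodic task on a single-threaded scheduler and returns the
     * idle gap (in ms) between the end of each run and the start of the next.
     */
    static List<Long> measureGapsMillis(
            boolean fixedRate, long taskMillis, long intervalMillis, int runs) {
        ScheduledExecutorService worker = Executors.newScheduledThreadPool(1);
        List<Long> gaps = new ArrayList<>();
        Long[] lastEnd = {null};
        CountDownLatch done = new CountDownLatch(runs);

        // Stand-in for a slow partition discovery call.
        Runnable slowDiscovery = () -> {
            if (done.getCount() == 0) {
                return; // enough runs recorded, ignore any extra trigger
            }
            long start = System.nanoTime();
            if (lastEnd[0] != null) {
                gaps.add(TimeUnit.NANOSECONDS.toMillis(start - lastEnd[0]));
            }
            try {
                Thread.sleep(taskMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            lastEnd[0] = System.nanoTime();
            done.countDown();
        };

        if (fixedRate) {
            // Next run is due at previousStart + interval, even if that is already in the past.
            worker.scheduleAtFixedRate(slowDiscovery, 0, intervalMillis, TimeUnit.MILLISECONDS);
        } else {
            // Next run is due at previousEnd + interval.
            worker.scheduleWithFixedDelay(slowDiscovery, 0, intervalMillis, TimeUnit.MILLISECONDS);
        }

        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            worker.shutdownNow();
        }
        return gaps;
    }

    public static void main(String[] args) {
        // The task takes 150 ms, the interval is 100 ms, three runs are recorded per mode.
        System.out.println("fixed rate gaps (ms):  " + measureGapsMillis(true, 150, 100, 3));
        System.out.println("fixed delay gaps (ms): " + measureGapsMillis(false, 150, 100, 3));
    }
}
```

When the task takes longer than the interval, the fixed-rate schedule starts
each overdue run right after the previous one finishes, so the single worker
thread has almost no idle time for anything else. The fixed-delay schedule
always leaves at least one full interval between runs.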