Fei Feng created FLINK-34566:
--------------------------------

             Summary: Flink Kubernetes Operator reconciliation parallelism 
setting not work
                 Key: FLINK-34566
                 URL: https://issues.apache.org/jira/browse/FLINK-34566
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator
    Affects Versions: kubernetes-operator-1.7.0
            Reporter: Fei Feng
         Attachments: image-2024-03-04-10-58-37-679.png, 
image-2024-03-04-11-17-22-877.png

After upgrading JOSDK from version 4.3.0 to version 4.4.2 in FLINK-33005, we 
can no longer increase the reconciliation parallelism: the maximum 
reconciliation parallelism is 10. This results in a reconciliation delay of 
about 10-20 seconds for FlinkDeployment and SessionJob resources when we have 
a large-scale Flink session cluster and many Flink jobs.
 

After investigating and validating, I found that the cause is that the logic 
for creating the reconciliation thread pool in JOSDK changed significantly 
between these two versions.

v4.3.0: 
The reconciliation thread pool was created as a FixedThreadPool 
(maximumPoolSize was the same as corePoolSize), so we passed the number of 
reconciliation threads and got a thread pool that matched our expectations.


!image-2024-03-04-10-58-37-679.png|width=628,height=115!

[https://github.com/operator-framework/java-operator-sdk/blob/v4.3.0/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ConfigurationServiceOverrider.java#L198]

 

but in v4.4.2:

The reconciliation thread pool is created as a custom executor, to which 
corePoolSize and maximumPoolSize can be passed separately. The problem is that 
we only set the maximumPoolSize of the thread pool, while the corePoolSize 
defaults to 10. As a result, the thread pool size stays at 10 and the majority 
of events are placed in the workQueue for a while.

!image-2024-03-04-11-17-22-877.png|width=594,height=117!

https://github.com/operator-framework/java-operator-sdk/blob/v4.4.2/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ExecutorServiceManager.java#L37
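
The queuing behavior described above can be reproduced with plain java.util.concurrent, independent of JOSDK. The following is a minimal sketch, not JOSDK's actual code: the class name PoolSizeDemo, the helper peakThreads, and the sizes used are hypothetical, and it assumes the executor is a ThreadPoolExecutor backed by an unbounded work queue. Under that assumption, threads beyond corePoolSize are only started when the queue rejects a task, which never happens.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizeDemo {

    /**
     * Submits `tasks` blocking tasks to a pool with the given core and maximum
     * sizes and returns how many worker threads the pool actually started.
     */
    static int peakThreads(int core, int max, int tasks) {
        // Assumption: unbounded queue, so offers to the queue never fail.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int started = pool.getPoolSize(); // workers created so far
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return started;
    }

    public static void main(String[] args) {
        // Only maximumPoolSize raised: the pool stays at corePoolSize.
        System.out.println(peakThreads(10, 50, 200)); // prints 10
        // corePoolSize == maximumPoolSize, as with a fixed thread pool.
        System.out.println(peakThreads(50, 50, 200)); // prints 50
    }
}
```

In this sketch, raising only the maximum leaves 190 of the 200 tasks waiting in the queue, while setting the core size equal to the maximum gives the expected parallelism.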

 



