[ https://issues.apache.org/jira/browse/FLINK-34576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824041#comment-17824041 ]
Gyula Fora commented on FLINK-34576:
------------------------------------

I am happy to review your PR if you try to bump the version

> Flink deployment keep staying at RECONCILING/STABLE status
> ----------------------------------------------------------
>
>                 Key: FLINK-34576
>                 URL: https://issues.apache.org/jira/browse/FLINK-34576
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.6.1
>            Reporter: chenyuzhi
>            Priority: Major
>         Attachments: image-2024-03-05-15-13-11-032.png
>
>
> The HA mode of flink-kubernetes-operator is being used. When one of the pods
> of flink-kubernetes-operator restarts, flink-kubernetes-operator switches the
> leader. However, some flinkdeployments have been in the
> *JOB_STATUS=RECONCILING&LIFECYCLE_STATE=STABLE* state for a long time.
> Through the cmd "kubectl describe flinkdeployment xxx", can see the following
> error, but there are no exceptions in the flink-kubernetes-operator log.
>
> {code:java}
> Status:
>   Cluster Info:
>     Flink - Revision: b6d20ed @ 2023-12-20T10:01:39+01:00
>     Flink - Version: 1.14.0-GDC1.6.0
>     Total - Cpu: 7.0
>     Total - Memory: 30064771072
>   Error:
> {"type":"org.apache.flink.kubernetes.operator.exception.ReconciliationException","message":"org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException:
> java.lang.RuntimeException: Failed to load
> configuration","additionalMetadata":{},"throwableList":[{"type":"org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException","message":"java.lang.RuntimeException:
> Failed to load
> configuration","additionalMetadata":{}},{"type":"java.lang.RuntimeException","message":"Failed
> to load configuration","additionalMetadata":{}}]}
>   Job Manager Deployment Status: READY
>   Job Status:
>     Job Id: cf44b5e73a1f263dd7d9f2c82be5216d
>     Job Name: noah_stream_studio_1754211682_2218100380
>     Savepoint Info:
>       Last Periodic Savepoint Timestamp: 0
>       Savepoint History:
>     Start Time: 1705635107137
>     State: RECONCILING
>     Update Time: 1709272530741
>   Lifecycle State: STABLE {code}
>
> !image-2024-03-05-15-13-11-032.png!
>
> version:
> flink-kubernetes-operator: 1.6.1
> flink: 1.14.0/1.15.2 (flinkdeployment 1200+)
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)