wangyang0918 commented on code in PR #20590: URL: https://github.com/apache/flink/pull/20590#discussion_r949073763
##########
flink-kubernetes/src/main/java/org/apache/flink/kubernetes/highavailability/KubernetesStateHandleStore.java:
##########
@@ -213,14 +214,26 @@ public RetrievableStateHandle<T> addAndLock(String key, T state)
         // initialize flag to serve the failure case
         boolean discardState = true;
+        final AtomicInteger retryNum = new AtomicInteger(0);
         try {
             // a successful operation will result in the state not being discarded
             discardState =
                     !updateConfigMap(
                                     cm -> {
+                                        retryNum.incrementAndGet();
                                         try {
                                             return addEntry(cm, key, serializedStoreHandle);
                                         } catch (Exception e) {
+                                            // It could happen the fabric8 k8s client retries a
+                                            // transaction that has already succeeded due to network
+                                            // issues. We let the AlreadyExistException caused by
+                                            // PossibleInconsistentStateException here to avoid
+                                            // discarding the state.
+                                            if (retryNum.get() > 1
+                                                    && e instanceof AlreadyExistException) {
+                                                e.initCause(

Review Comment:
   If you are suggesting that we simply stop retrying once `addEntry` has already succeeded on an earlier attempt, rather than throwing a `PossibleInconsistentStateException`, I can update the current implementation accordingly.
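
To make the "break the retry" alternative concrete, here is a minimal, self-contained sketch of the idea. It is not the Flink or fabric8 code: `RetryAwareAdd`, `addWithRetries`, and the `lostResponses` parameter are hypothetical names, a plain `Map` stands in for the ConfigMap, and the nested `AlreadyExistException` is a local stand-in for the real one. The assumption is that an "already exists" failure on a retry, where the stored value equals the one we tried to write, means an earlier attempt succeeded and only its response was lost.

```java
import java.util.HashMap;
import java.util.Map;

class RetryAwareAdd {

    /** Local stand-in for the real AlreadyExistException (assumption, not Flink's class). */
    public static class AlreadyExistException extends Exception {
        public AlreadyExistException(String message) {
            super(message);
        }
    }

    /** Adds the entry, or fails if the key is already present. */
    static void addEntry(Map<String, String> store, String key, String value)
            throws AlreadyExistException {
        if (store.containsKey(key)) {
            throw new AlreadyExistException(key + " already exists");
        }
        store.put(key, value);
    }

    /**
     * Simulates a client that retries because the first {@code lostResponses}
     * responses are lost after the write has been applied.
     *
     * @return true if the entry is known to be stored, so the state must be kept
     */
    public static boolean addWithRetries(
            Map<String, String> store, String key, String value, int lostResponses)
            throws AlreadyExistException {
        int attempt = 0;
        while (true) {
            attempt++;
            try {
                addEntry(store, key, value);
            } catch (AlreadyExistException e) {
                // On a retry, our own value being present means an earlier
                // attempt succeeded: stop retrying and report success.
                if (attempt > 1 && value.equals(store.get(key))) {
                    return true;
                }
                // First attempt, or someone else's value: a genuine conflict.
                throw e;
            }
            if (attempt > lostResponses) {
                return true; // the response arrived, no retry needed
            }
            // Otherwise the response was "lost"; loop to retry.
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> store = new HashMap<>();
        // One lost response: the second attempt sees our own entry and stops.
        System.out.println(addWithRetries(store, "job-1", "handle-a", 1));
    }
}
```

Comparing the stored value, rather than only counting attempts, is a design choice of this sketch: it keeps a conflicting entry written by another party from being mistaken for our own earlier success.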