[ https://issues.apache.org/jira/browse/FLINK-29819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gyula Fora reassigned FLINK-29819:
----------------------------------

    Assignee: Clara Xiong

> Record an error event when savepoint fails within grace period
> --------------------------------------------------------------
>
>                 Key: FLINK-29819
>                 URL: https://issues.apache.org/jira/browse/FLINK-29819
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, SavepointObserver retries when a savepoint fails within the grace
> period, and keeps retrying until the savepoint succeeds or a failure occurs
> after the grace period. The grace period applies to each retry separately.
> If the underlying cause of a quick failure is not transient, such as a
> misconfigured path or a persistent storage failure, the retries keep going
> without recording any error event.
> We should first add logic to record an error event for each failed attempt.
> We can consider capping the retries if they become a pain for users.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
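
For illustration only, the behavior proposed in the issue could be sketched roughly as follows. This is a minimal, self-contained sketch and not the operator's actual SavepointObserver code: the class SavepointRetrySketch, the EventLog recorder, the Action enum, and the maxAttempts parameter are all hypothetical names introduced here. It assumes the grace period is measured from the moment each attempt is triggered, and it treats the retry cap as optional (0 means uncapped), since the issue only mentions capping as something to consider.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the Flink Kubernetes Operator. */
final class SavepointRetrySketch {

    /** What the observer should do after a failed savepoint attempt. */
    enum Action { RETRY, GIVE_UP }

    /** Stand-in for whatever event recorder the operator would use (assumption). */
    static final class EventLog {
        final List<String> errors = new ArrayList<>();

        void recordError(String message) {
            errors.add(message);
        }
    }

    /**
     * Handles one failed savepoint attempt.
     *
     * @param triggeredAt when this attempt was triggered
     * @param failedAt    when the failure was observed
     * @param gracePeriod per-attempt grace period
     * @param attempt     1-based number of the attempt that failed
     * @param maxAttempts optional cap on attempts; 0 or less means uncapped (assumption)
     */
    static Action onSavepointFailure(
            Instant triggeredAt,
            Instant failedAt,
            Duration gracePeriod,
            int attempt,
            int maxAttempts,
            String cause,
            EventLog events) {
        // Record an error event for every failed attempt, including ones that will be retried.
        events.recordError("Savepoint attempt " + attempt + " failed: " + cause);

        // A failure counts as "quick" if it happened within this attempt's grace period.
        boolean withinGrace =
                Duration.between(triggeredAt, failedAt).compareTo(gracePeriod) <= 0;
        boolean capReached = maxAttempts > 0 && attempt >= maxAttempts;

        return (withinGrace && !capReached) ? Action.RETRY : Action.GIVE_UP;
    }
}
```

With a sketch like this, a non-transient problem such as a misconfigured path would still be retried, but each failed attempt would leave an error event that users can see.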