gyfora commented on code in PR #724: URL: https://github.com/apache/flink-kubernetes-operator/pull/724#discussion_r1418985847
##########
docs/content/docs/custom-resource/job-management.md:
##########
@@ -288,12 +288,33 @@ Rollback is currently only supported for `FlinkDeployments`.
 
 ## Manual Recovery
 
-There are cases when manual intervention is required from the user to recover a Flink application deployment.
+There are cases when manual intervention is required from the user to recover a Flink application deployment or to restore to a user specified state.
 
 In most of these situations the main reason for this is that the deployment got into a state where the operator cannot determine the health of the application or the latest checkpoint information to be used for recovery.
 While these cases are not common, we need to be prepared to handle them.
 
-Fortunately almost any issue can be recovered by the user manually by using the following steps:
+Users have two options to restore a job from a target savepoint / checkpoint
+
+### Redeploy using the savepointRedeployNonce
+
+It is possible to redeploy a `FlinkDeployment` or `FlinkSessionJob` resource from a target savepoint by using the combination of `savepointRedeployNonce` and `initialSavepointPath` in the job spec:
+
+```yaml
+ job:
+   initialSavepointPath: file://redeploy-target-savepoint
+   savepointRedeployNonce: null -> 1

Review Comment:
   alright, I will update it to avoid unexpected copy-paste behaviour
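
A sketch of how the snippet could look once it is safe to copy and paste: the `null -> 1` transition moves into a YAML comment and the field holds only the new value. The path is the placeholder from the diff; the comment wording and the choice of `1` are assumptions based on the hunk above, not the final text of the PR.

```yaml
job:
  initialSavepointPath: file://redeploy-target-savepoint
  # Assumed illustration: the field was previously unset (null) and is now set to 1.
  savepointRedeployNonce: 1
```

Copied verbatim, `savepointRedeployNonce: null -> 1` would most likely be parsed as the plain string `null -> 1` and not as a number, which is the kind of copy-paste surprise the comment refers to.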