Hi Flink Community,

My name is Tony Chen, and I am a software engineer at Robinhood. I have some questions about restarting a Flink application from a savepoint or checkpoint.

We currently store our checkpoints and savepoints in S3, and we would like to use the Apache Flink Kubernetes Operator to manage our Flink applications. I know that there is a field called "initialSavepointPath" (doc <https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/job-management/#manual-recovery>) that I can set in my Kubernetes manifest so that whenever I want my Flink application to start from a particular savepoint, it starts from the savepoint directory given in this field.

However, if I delete the FlinkDeployment resource altogether after new savepoints have been triggered and then redeploy it, it looks like I have to manually update "initialSavepointPath" to a newer savepoint directory so that the Flink application starts from that newer savepoint. Is there a way to redeploy FlinkDeployment resources so that the latest checkpoint or savepoint is used, without having to update the "initialSavepointPath" field?

In my testing, whenever I deleted the FlinkDeployment resource and redeployed it, the job either started from the savepoint in initialSavepointPath or, if initialSavepointPath was not set, started from checkpoint 1.

For example, let's say I restarted a Flink application from savepoint 10 with initialSavepointPath set to s3://savepoints/savepoint-10, and later savepoint 20 completed and was stored at s3://savepoints/savepoint-20. Is there a way for me to delete this FlinkDeployment and redeploy it so that it resumes from savepoint 20, without updating initialSavepointPath?

Thanks,
Tony

P.S. I'm going through the Apache Flink Kubernetes Operator source code to understand how the operator starts a Flink job.
Some relevant code:

- https://github.com/apache/flink-kubernetes-operator/blob/0c341ebe13645f4e9802cfd780c5b50f59e29363/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java#L500
- https://github.com/apache/flink-kubernetes-operator/blob/0c341ebe13645f4e9802cfd780c5b50f59e29363/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/observer/SavepointObserver.java#L204

--
<http://www.robinhood.com/>
Tony Chen
Software Engineer
Menlo Park, CA
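
To make the manual step in the savepoint-10 / savepoint-20 example concrete, it could be sketched roughly as below. This is only an illustration of the workaround I would like to avoid, not operator behavior. The helper names (latest_savepoint, with_initial_savepoint), the deployment name, and the assumption that savepoint directories end in a numeric suffix are all hypothetical and taken from the example paths in this email; real savepoint directory names may look different.

```python
import copy
import re


def latest_savepoint(paths):
    """Return the path with the highest numeric suffix, or None if none match.

    Assumes names like .../savepoint-<number>, as in the example in this email.
    """
    best_path, best_id = None, -1
    for path in paths:
        match = re.search(r"savepoint-(\d+)/?$", path)
        if match and int(match.group(1)) > best_id:
            best_path, best_id = path, int(match.group(1))
    return best_path


def with_initial_savepoint(manifest, savepoint_path):
    """Return a copy of the manifest with spec.job.initialSavepointPath set."""
    updated = copy.deepcopy(manifest)
    job = updated.setdefault("spec", {}).setdefault("job", {})
    job["initialSavepointPath"] = savepoint_path
    return updated


# Trimmed-down FlinkDeployment manifest as a dict; other required fields omitted.
manifest = {
    "apiVersion": "flink.apache.org/v1beta1",
    "kind": "FlinkDeployment",
    "metadata": {"name": "example-app"},  # hypothetical name
    "spec": {"job": {"initialSavepointPath": "s3://savepoints/savepoint-10"}},
}

# In practice this list would come from listing the S3 savepoint directory.
savepoints = ["s3://savepoints/savepoint-10", "s3://savepoints/savepoint-20"]

redeploy_manifest = with_initial_savepoint(manifest, latest_savepoint(savepoints))
print(redeploy_manifest["spec"]["job"]["initialSavepointPath"])
# prints: s3://savepoints/savepoint-20
```

The original manifest is left untouched, so the edited copy is what would be applied on redeploy. My question is whether the operator can make this lookup unnecessary.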