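Hi Phil,

I think you can use the "-s :checkpointMetaDataPath" argument of "flink run" to resume the job from a retained checkpoint [1]. Note that ":checkpointMetaDataPath" is the placeholder notation used in the docs: replace the whole token, colon included, with the actual path. The leading ':' in your command is what triggered the "not a valid file URI" error, as Jiadong pointed out.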
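In case it helps, here is a rough Python sketch of how the value for "-s" could be prepared. It is not taken from the docs. The helper names (normalize_checkpoint_path, latest_retained_checkpoint, build_resume_command) are made up for illustration, and it assumes the usual filesystem layout where each retained checkpoint is a chk-<n> directory containing a _metadata file under the job's checkpoint directory. Please check that against what you actually see under /opt/flink/checkpoints.

```python
import shlex
from pathlib import Path
from typing import Optional


def normalize_checkpoint_path(raw: str) -> str:
    """Drop a leading ':' copied from the docs' placeholder notation."""
    path = raw.strip()
    return path[1:] if path.startswith(":") else path


def latest_retained_checkpoint(job_checkpoint_dir: str) -> Optional[Path]:
    """Return the highest-numbered chk-<n> directory that has a _metadata file.

    Assumption: checkpoints live in <job_checkpoint_dir>/chk-<n>/_metadata.
    Returns None if no such directory is found.
    """
    best = None
    for entry in Path(job_checkpoint_dir).glob("chk-*"):
        number = entry.name[len("chk-"):]
        if number.isdigit() and (entry / "_metadata").is_file():
            if best is None or int(number) > best[0]:
                best = (int(number), entry)
    return best[1] if best else None


def build_resume_command(checkpoint_path: str, job_file: str) -> str:
    """Build the 'flink run -s <path> -py <job>' command line as a string."""
    args = ["flink", "run", "-s", normalize_checkpoint_path(checkpoint_path),
            "-py", job_file]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Path and job file are the ones from this thread; adjust to your setup.
    print(build_resume_command(
        ":/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc",
        "/opt/app/flink_job.py",
    ))
```

With docker-compose you would prefix the printed command with "docker-compose exec jobmanager", as in your original attempt. The path is passed on the command line rather than as a Python argument inside the job module.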
[1] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/checkpoints/#resuming-from-a-retained-checkpoint

Best,
Jinzhong Li

On Mon, May 20, 2024 at 2:29 AM Phil Stavridis <phi...@gmail.com> wrote:

> Hi Lu,
>
> Thanks for your reply. In what way should the paths be passed to the job
> that needs to use the checkpoint? Is the standard way to use -s :/<path>,
> or to pass the path in the module as a Python arg?
>
> Kind regards
> Phil
>
> > On 18 May 2024, at 03:19, jiadong.lu <archzi...@gmail.com> wrote:
> >
> > Hi Phil,
> >
> > AFAIK, the error indicates that your path was incorrect.
> > You should use '/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc' or
> > 'file:///opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc' instead.
> >
> > Best,
> > Jiadong.Lu
> >
> > On 5/18/24 2:37 AM, Phil Stavridis wrote:
> >> Hi,
> >> I am trying to test how checkpoints work for restoring state, but I am
> >> not sure how to run a new instance of a Flink job, after I have
> >> cancelled it, using the checkpoints which I store in the filesystem of
> >> the job manager, e.g. /opt/flink/checkpoints.
> >> I have tried passing the checkpoint as an argument in the function and
> >> using it while setting the checkpoint, but it looks like the way it is
> >> done is something like below:
> >> docker-compose exec jobmanager flink run -s :/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc -py /opt/app/flink_job.py
> >> But I am getting this error:
> >> Caused by: java.io.IOException: Checkpoint/savepoint path ':/opt/flink/checkpoints/1875588e19b1d8709ee62be1cdcc' is not a valid file URI. Either the pointer path is invalid, or the checkpoint was created by a different state backend.
> >> What is wrong with the way the job is re-submitted to the cluster?
> >> Kind regards
> >> Phil