Re: Checkpoint fail due to timeout

Alexey Trenikhun Wed, 17 Mar 2021 08:58:23 -0700

According to [1], checkpoints do not support Flink-specific features like
rescaling, but I can try. Thank you for the suggestions.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#difference-to-savepoints

________________________________
From: ChangZhuo Chen (陳昌倬)
Sent: Wednesday, March 17, 2021 12:29 AM
To: Alexey Trenikhun
Cc: ro...@apache.org; Flink User Mail List
Subject: Re: Checkpoint fail due to timeout

On Wed, Mar 17, 2021 at 05:45:38AM +0000, Alexey Trenikhun wrote:
> In my opinion it looks similar. Were you able to tune Flink to make it work?
> I'm stuck with it. I wanted to scale up, hoping to reduce backpressure, but to
> rescale I need to take a savepoint, which never completes (or at least takes
> longer than 3 hours).

You can use an aligned checkpoint to scale your job. Just restarting from
a checkpoint with the same jar file and the new parallelism should do the
trick.
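
As a rough sketch of what that restart might look like from the Flink CLI,
assuming the checkpoint was retained (externalized) and is an aligned one.
The checkpoint path, parallelism value, and jar name below are hypothetical
placeholders, not values from this thread:

```shell
# Hypothetical placeholders; substitute the values for your own job.
CHECKPOINT_PATH="hdfs:///checkpoints/<job-id>/chk-<n>"  # retained checkpoint directory
NEW_PARALLELISM=8                                       # desired parallelism after rescaling
JOB_JAR="my-job.jar"                                    # same jar the job was running

# -s restores state from the given savepoint or retained checkpoint path;
# -p sets the default parallelism for the resubmitted job.
flink run -s "$CHECKPOINT_PATH" -p "$NEW_PARALLELISM" "$JOB_JAR"
```

The new parallelism cannot exceed the job's configured maximum parallelism.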

--
ChangZhuo Chen (陳昌倬) czchen@{czchen,debian}.org
http://czchen.info/
Key fingerprint = BA04 346D C2E1 FE63 C790  8793 CC65 B0CD EC27 5D5B
