On Fri, Jan 28, 2022 at 02:43:11PM +0800, Caizhi Weng wrote:
> Chen-Che Huang <acmic...@gmail.com> wrote on Thu, Jan 27, 2022 at 11:10:
> > We have two questions for checkpoint retention.
> >
> >    1. When our cron job creates a savepoint called SP, it seems those
> >    checkpoints created earlier than SP still cannot be deleted. We thought
> >    the new checkpoints are generated based on SP and thus old checkpoints
> >    before SP will be useless. However, it seems the checkpoint mechanism
> >    doesn't work as we thought. Is what we thought correct?
> >    2. To save storage cost, we’d like to know what checkpoints can be
> >    deleted. Currently, each version of our app has 10 checkpoints. We wonder
> >    whether we can delete checkpoints generated for previous versions of our
> >    apps?

Some details below:

* We have two GCS buckets to store checkpoints and savepoints, like the
  following:

  * gs://flink-checkpoints has no retention configuration.
  * gs://flink-savepoints has a retention of 5 days.

* The checkpoint configuration is:
  * state.backend.incremental: true
  * RETAIN_ON_CANCELLATION
* We create a savepoint every 4 hours for recovery.
* The business requires up to 180 days of historical data.
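
Written out as flink-conf.yaml, the setup above would look roughly like
the sketch below. The option names for the two directories and for the
retention mode are my understanding of the Flink 1.14 configuration keys,
not copied from our deployment, and the paths are the bucket roots rather
than our exact prefixes:

    # Sketch only; key names assumed from the Flink 1.14 configuration docs.
    state.backend.incremental: true
    # Bucket without any retention rule.
    state.checkpoints.dir: gs://flink-checkpoints
    # Bucket with a 5-day retention rule.
    state.savepoints.dir: gs://flink-savepoints
    # Keep externalized checkpoints when the job is cancelled.
    execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION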

The questions are:

* We want to set retention on gs://flink-checkpoints to reduce storage
  cost. However, Flink sometimes cannot restore from a checkpoint due to
  missing data when retention is configured on gs://flink-checkpoints.
  Is there any way to configure retention safely for Flink?

* We don't use DELETE_ON_CANCELLATION, to avoid deleting state data
  accidentally.


-- 
ChangZhuo Chen (陳昌倬) czchen@{czchen,debian}.org
http://czchen.info/
Key fingerprint = BA04 346D C2E1 FE63 C790  8793 CC65 B0CD EC27 5D5B
