Hi,

I think this was last discussed in FLIP-193 [1], where the reasoning is
mostly the same as what Matthias said: savepoints are owned by the user,
so Flink cannot depend on them.

In older versions, Flink also took savepoints into account when restoring a
job, with the possibility to skip savepoints by using the config
"execution.checkpointing.prefer-checkpoint-for-recovery" introduced in
FLINK-11159 [2]. This option was later removed in FLINK-20427 [3] because
it could lead to loss of data. Since the introduction of FLIP-193, only
checkpoints are considered during recovery.
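
To make the difference concrete, here is a toy model of the two behaviours.
It is only a sketch and not Flink code: the Snapshot class, the
restore_candidate function and the include_savepoints flag are names I made
up for illustration, and I am assuming that the newest eligible snapshot is
the one chosen for recovery.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a completed checkpoint or savepoint."""
    snapshot_id: int
    is_savepoint: bool


def restore_candidate(
    snapshots: List[Snapshot], include_savepoints: bool
) -> Optional[Snapshot]:
    """Return the newest snapshot that is eligible for recovery.

    include_savepoints=True models the older behaviour described above;
    include_savepoints=False models recovery from checkpoints only.
    """
    eligible = [
        s for s in snapshots if include_savepoints or not s.is_savepoint
    ]
    # Assumption: the snapshot with the highest id is the most recent one.
    return max(eligible, key=lambda s: s.snapshot_id, default=None)


# Two periodic checkpoints followed by a manually triggered savepoint.
history = [
    Snapshot(1, is_savepoint=False),
    Snapshot(2, is_savepoint=False),
    Snapshot(3, is_savepoint=True),
]

checkpoints_only = restore_candidate(history, include_savepoints=False)
with_savepoints = restore_candidate(history, include_savepoints=True)
```

In this model the savepoint still consumes id 3, but with
include_savepoints=False the job falls back to checkpoint 2, which is the
situation Gyula describes.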

Best regards,
Mate

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership
[2] https://issues.apache.org/jira/browse/FLINK-11159
[3] https://issues.apache.org/jira/browse/FLINK-20427

Matthias Pohl <map...@apache.org> wrote (on Fri, Jun 7, 2024, 8:55):

> One reason could be that the savepoints are self-contained, owned by the
> user rather than Flink and, therefore, could be moved. Flink wouldn't have
> a proper reference in that case anymore.
>
> I don't have a link to a discussion, though.
>
> Best,
> Matthias
>
> On Fri, Jun 7, 2024 at 8:47 AM Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hey Devs!
> >
> > What is the reason / rationale for savepoints being ignored during
> > failover scenarios?
> >
> > I see they are not even recorded as the last valid checkpoint in the HA
> > metadata (only the checkpoint id counter is bumped) so if the JM fails
> > after a manually triggered savepoint the job will still fall back to the
> > previous checkpoint instead.
> >
> > I am sure there must have been some discussion around it, but I can't
> > find it.
> >
> > Thank you!
> > Gyula
> >
>
