Thanks Zakelly for starting this discussion. For both users and developers, deprecating RestoreMode#LEGACY makes the semantics clearer and lowers maintenance costs, and Flink 2.0 is a good time to do this. So +1 for the overall idea.

Best,
Yanfei

Zakelly Lan <zakelly....@gmail.com> wrote on Thu, Jan 11, 2024 at 14:57:
>
> Hi devs,
>
> I'd like to start a discussion on FLIP-416: Deprecate and remove the
> RestoreMode#LEGACY[1].
>
> The FLIP-193[2] introduced two modes of state file ownership during
> checkpoint restoration: RestoreMode#CLAIM and RestoreMode#NO_CLAIM. The
> LEGACY mode, which was how Flink worked until 1.15, has been superseded by
> NO_CLAIM as the default mode. The main drawback of LEGACY mode is that the
> new job relies on artifacts from the old job without cleaning them up,
> leaving users uncertain about when it is safe to delete the old checkpoint
> directories. This leads to the accumulation of unnecessary checkpoint files
> that are never cleaned up. Considering cluster availability and job
> maintenance, it is not recommended to use LEGACY mode. Users can choose
> either of the other two modes to get clear semantics for state file ownership.
>
> This FLIP proposes to deprecate the LEGACY mode and remove it completely in
> the upcoming Flink 2.0. This will make the semantics clear, eliminate many
> bugs caused by mode transitions involving LEGACY mode (e.g. FLINK-27114 [3]),
> and improve code maintainability.
>
> Looking forward to hearing from you!
>
> [1] https://cwiki.apache.org/confluence/x/ookkEQ
> [2] https://cwiki.apache.org/confluence/x/bIyqCw
> [3] https://issues.apache.org/jira/browse/FLINK-27114
>
> Best,
> Zakelly