The suggestion of Zakelly (Using noDefaultValue to make different StateBackend have its own default value) makes sense to me. I haven't gotten a better idea about it. Maybe @ro...@apache.org <ro...@apache.org> who is the owner of incremental checkpoint support of HashMapStateBackend could share more ideas about it.
On Sat, Jun 25, 2022 at 2:38 PM Lihe Ma <ma_l...@163.com> wrote: > Hi, Yuan Mei, > > > Thanks for your feedback. > > The configuration state.backend.incremental should work for all state > backends, which I think is reasonable. If we introduce more > configurations(such as state.backend.hashmap.incremental, > state.backend.rocksdb.incremental), it does not bring enough benefit , and > users will be confused by the relationship of these configurations as > state.backend.incremental has been widely used for a long time. > > > If you prefer to reuse existing HashMapStateBackend to support the newly > incremental checkpoint feature, to be honest, I did not come up with an > idea to make everyone satisfied without introducing more configurations. It > seems that you have some ideas in your last reply, would you like to share > something here? Otherwise, I think we have to postpone the change to next > releases and let the beta feature of incremental checkpoint hashmap > state-backend merged first in the flink-1.16 release, even though many guys > show big +1 for the change of state.backend.incremental in this discussion. > > > Best, > Lihe Ma > > > > > > > > > > > > > > > > At 2022-06-19 20:39:45, "Yuan Mei" <yuanmei.w...@gmail.com> wrote: > >Hey @Lihe Ma, > > > >Thanks and I appreciate your feedback! > > > >*What do I think of introducing a new state backend?* > >I am hesitant to follow this way of introducing a new state backend to > >implement `HashMapStateBackend with incremental checkpoint enabled`, > >because: > >- It causes more confusion for users: why incremental checkpoint can > >be *enabled > >through a flag for RocksDB* but having to *write a new Statebackend for > >HashMap*? > >- For the same reason above, this causes *inconsistent system behavior* > and > >potentially lead to more problems. > > For example, do we want to eventually keep both state backends > (HashMap > >and the newly introduced one) or deprecate one? > > How to deprecate one if both are widely used in prod, especially since > >people can set up state backends in the job code, which means they have to > >change/update their job code for depreciation? > >- Most importantly, I do not think introducing a new state backend to > >implement `HashMapStateBackend with incremental checkpoint enabled` > >resolves the problem. I will explain that later. > > > >Now let's come back to the original proposal > >*Make incremental checkpoint as default for RocksDB?* > >As I've already stated previously, I am +1 on this. And the reason is well > >explained (the motivation, backward compatibilities, and previous > glitches > >solved by FLIP 193 e.t.c) in the original proposal and other > >threads/tickets. > > > >However, there is a gap between the above goal and the way proposed to > >achieve it: "set as the default state.backend.incremental = true". The way > >proposed is to make incremental checkpoint default universally for all > >state backends. > > > >*Should we make incremental checkpoints as default universally* *for all > >state backends?* > >I am not sure about this, or at least before a few questions are > >answered/discussed/solved. > > > >1. Is incremental checkpoints as default needed for all state backends? > > Probably not, but if one state backend (like RocksDB) has strong > >benefits, but others are neutral or non-neg effects, it is acceptable. > >2. What is the impact of the default flag switch on the currently > supported > >state backends? > > RocksDB prefers incremental checkpoints, and HashMap (In Memory) does > >not support incremental yet (no effect). > >3. How about system extensibility and future migration? > > There is a problem here. Nowadays, HashMap (In Memory) state > >backends are widely used in prod as well when the state size is small > >enough (in most cases based on user feedback). The support for incremental > >checkpoints on HashMapStatebackend is asked by users. Then think about > >this: if we set default state.backend.incremental = true, and launch the > >incremental checkpoint feature for HashMapStatebackend, there is a big > >upgrading cost for existing users that uses HashMapStatebackend. > > Introducing a new state backend does not solve the problem of "system > >extensibility". It introduces new problems like "deprecating and > migration" > >as stated above. > > > >*What is my suggestion?* > >- Design a way to support different default configurations for different > >state backends (I do not think it is difficult to do). > >- Have a better way to solve the "system extensibility" problem. > > > >Best, > >Yuan > > > > > >On Fri, Jun 17, 2022 at 11:49 AM LuNing Wang <wang4lun...@gmail.com> > wrote: > > > >> Strongly +1 > >> > >> Best, > >> LuNing Wang > >> > >> Zakelly Lan <zakelly....@gmail.com> 于2022年6月17日周五 00:15写道: > >> > >> > Thanks for bringing this up. > >> > I'm +1 on enabling the incremental checkpoint by default on RocksDB. > >> But I > >> > also agree with Yuan about not enabling this on newly implemented > >> > incremental checkpoint for hashmap statebackend. > >> > I am wondering can we make it behave differently for different state > >> > backends when ```state.backend.incremental``` is not set? Although it > may > >> > bring some confusion in the default behaviour. > >> > > >> > On Thu, Jun 16, 2022 at 10:18 PM Lihe Ma <ma_l...@163.com> wrote: > >> > > >> > > Thanks for the suggestion, Yuan. > >> > > > >> > > > >> > > How about naming the newly in-memory state-backend, which supports > >> > > incremental checkpoint, as HeapStateBackend . And let the default > >> > > state-backend still stay as HashMapStateBackend. > >> > > By doing so, we can: > >> > > 1) the default value of parameter state.backend.incremental could be > >> set > >> > > as true safely, which is so widely accepted according to the > replies in > >> > > this thread. > >> > > 2) your newly in-memory state-backend could also be merged and users > >> who > >> > > want to try in-memory incremental checkpoints could also have a > >> solution. > >> > > We can also set the newly HeapStateBackend as default state-backend > in > >> > the > >> > > future if the feature of incremental checkpoint is stable. What do > you > >> > > think? > >> > > > >> > > > >> > > Best, > >> > > Lihe Ma > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > 在 2022-06-15 21:03:31,"Feifan Wang" <zoltar9...@163.com> 写道: > >> > > >Thanks for bringing this up. > >> > > >Strongly +1 > >> > > > > >> > > > > >> > > > > >> > > > > >> > > >—————————————— > >> > > >Name: Feifan Wang > >> > > >Email: zoltar9...@163.com > >> > > > > >> > > > > >> > > >---- Replied Message ---- > >> > > >| From | Yuan Mei<yuanmei.w...@gmail.com> | > >> > > >| Date | 06/15/2022 11:41 | > >> > > >| To | dev<dev@flink.apache.org> , > >> > > ><ro...@ververica.com> | > >> > > >| Subject | Re: [DISCUSS ] Make state.backend.incremental as true > by > >> > > default | > >> > > >Thanks for bringing this up. > >> > > > > >> > > >I am +1 on making incremental checkpoints by default for RocksDB, > but > >> > not > >> > > >universally for all state backends. > >> > > > > >> > > >Besides being widely used in prod, enabling incremental checkpoint > for > >> > > >RocksDB by default is also a pre-requisite when enabling > task-local by > >> > > >default FLINK-15507 < > >> https://issues.apache.org/jira/browse/FLINK-15507> > >> > > > > >> > > >The incremental checkpoint for the hashmap statebackend is under > >> review > >> > > >right now. CC @ro...@ververica.com <ro...@ververica.com> , which > is > >> > not a > >> > > >good idea being enabled by default in the first version. > >> > > > > >> > > >Best, > >> > > > > >> > > >Yuan > >> > > > > >> > > >On Tue, Jun 14, 2022 at 7:33 PM Jiangang Liu < > >> liujiangangp...@gmail.com > >> > > > >> > > >wrote: > >> > > > > >> > > >+1 for the suggestion. We have use the incremental checkpoint in > our > >> > > >production for a long time. > >> > > > > >> > > >Hangxiang Yu <master...@gmail.com> 于2022年6月14日周二 15:41写道: > >> > > > > >> > > >+1 > >> > > >It's basically enabled in most scenarios in production > environments. > >> > > >For HashMapStateBackend, it will adopt a full checkpoint even if we > >> > > >enable > >> > > >incremental checkpoint. It will also support incremental checkpoint > >> > after > >> > > >[1]. It's compatible. > >> > > >BTW, I think we may also need to improve the documentation of > >> > incremental > >> > > >checkpoints which users usually ask. There are some tickets like > >> [2][3]. > >> > > > > >> > > >Best, > >> > > >Hangxiang. > >> > > > > >> > > >[1] https://issues.apache.org/jira/browse/FLINK-21648 > >> > > >[2] https://issues.apache.org/jira/browse/FLINK-22797 > >> > > >[3] https://issues.apache.org/jira/browse/FLINK-7449 > >> > > > > >> > > >On Mon, Jun 13, 2022 at 7:48 PM Rui Fan <1996fan...@gmail.com> > wrote: > >> > > > > >> > > >Strongly +1 > >> > > > > >> > > >Best, > >> > > >Rui Fan > >> > > > > >> > > >On Mon, Jun 13, 2022 at 7:35 PM Martijn Visser < > >> > > >martijnvis...@apache.org > >> > > > > >> > > >wrote: > >> > > > > >> > > >BTW, from my knowledge, nothing would happen for > >> > > >HashMapStateBackend, > >> > > >which does not support incremental checkpoint yet, when enabling > >> > > >incremental checkpoints. > >> > > > > >> > > >Thanks Yun, if no errors would occur then definitely +1 to enable > it > >> > > >by > >> > > >default > >> > > > > >> > > >Op ma 13 jun. 2022 om 12:42 schreef Alexander Fedulov < > >> > > >alexan...@ververica.com>: > >> > > > > >> > > >+1 > >> > > > > >> > > >From my experience, it is actually hard to come up with use cases > >> > > >where > >> > > >incremental checkpoints should explicitly not be enabled with the > >> > > >RocksDB > >> > > >state backend. If the state is so small that the full snapshots do > >> > > >not > >> > > >have any negative impact, one should consider using > >> > > >HashMapStateBackend > >> > > >anyway. > >> > > > > >> > > >Best, > >> > > >Alexander Fedulov > >> > > > > >> > > > > >> > > >On Mon, Jun 13, 2022 at 12:26 PM Jing Ge <j...@ververica.com> > >> > > >wrote: > >> > > > > >> > > >+1 > >> > > > > >> > > >Glad to see the kickoff of this discussion. Thanks Lihe for > >> > > >driving > >> > > >this! > >> > > > > >> > > >We have actually already discussed it internally a few months > >> > > >ago. > >> > > >After > >> > > >considering some corner cases, all agreed on enabling the > >> > > >incremental > >> > > >checkpoint as default. > >> > > > > >> > > >Best regards, > >> > > >Jing > >> > > > > >> > > >On Mon, Jun 13, 2022 at 12:17 PM Yun Tang <myas...@live.com> > >> > > >wrote: > >> > > > > >> > > >Strongly +1 for making incremental checkpoints as default. Many > >> > > >users > >> > > >have > >> > > >ever been asking why this configuration is not enabled by > >> > > >default. > >> > > > > >> > > >BTW, from my knowledge, nothing would happen for > >> > > >HashMapStateBackend, > >> > > >which does not support incremental checkpoint yet, when > >> > > >enabling > >> > > >incremental checkpoints. > >> > > > > >> > > > > >> > > >Best > >> > > >Yun Tang > >> > > >________________________________ > >> > > >From: Martijn Visser <martijnvis...@apache.org> > >> > > >Sent: Monday, June 13, 2022 18:05 > >> > > >To: dev@flink.apache.org <dev@flink.apache.org> > >> > > >Subject: Re: [DISCUSS ] Make state.backend.incremental as true > >> > > >by > >> > > >default > >> > > > > >> > > >Hi Lihe, > >> > > > > >> > > >What happens if we enable incremental checkpoints by default > >> > > >while > >> > > >the > >> > > >used > >> > > >memory backend is HashMapStateBackend, which doesn't support > >> > > >incremental > >> > > >checkpoints? > >> > > > > >> > > >Best regards, > >> > > > > >> > > >Martijn > >> > > > > >> > > >Op ma 13 jun. 2022 om 11:59 schreef Lihe Ma <ma_l...@163.com>: > >> > > > > >> > > >Hi, Everyone, > >> > > > > >> > > >I would like to open a discussion on setting incremental > >> > > >checkpoint > >> > > >as > >> > > >default behavior. > >> > > > > >> > > >Currently, the configuration of state.backend.incremental is > >> > > >set > >> > > >as > >> > > >false > >> > > >by default. Incremental checkpoint has been adopted widely in > >> > > >industry > >> > > >community for many years , and it is also well-tested from > >> > > >the > >> > > >feedback > >> > > >in > >> > > >the community discussion. Incremental checkpointing is more > >> > > >light-weighted: > >> > > >shorter checkpoint duration, less uploaded data and less > >> > > >resource > >> > > >consumption. > >> > > > > >> > > >In terms of backward compatibility, enable incremental > >> > > >checkpointing > >> > > >would > >> > > >not make any data loss no matter restoring from a full > >> > > >checkpoint/savepoint > >> > > >or an incremental checkpoint. > >> > > > > >> > > >FLIP-193 (Snapshot ownership)[1] has been released in 1.15, > >> > > >incremental > >> > > >checkpoint no longer depends on a previous restored > >> > > >checkpoint > >> > > >in > >> > > >default > >> > > >NO_CLAIM mode, which makes the checkpoint lineage much > >> > > >cleaner, > >> > > >it > >> > > >is a > >> > > >good chance to change the configuration > >> > > >state.backend.incremental > >> > > >to > >> > > >true > >> > > >as default. > >> > > > > >> > > >Thus, based on the above discussion, I suggest to make > >> > > >state.backend.incremental as true by default. What do you > >> > > >think > >> > > >of > >> > > >this > >> > > >proposal? > >> > > > > >> > > >[1] > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership > >> > > > > >> > > >Best regards, > >> > > >Lihe Ma > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > >> > > >> >