Sounds like a nice feature to have. Eagerly looking forward to the RFC.

On Sat, 27 Aug 2022 at 20:51, 冯健 <[email protected]> wrote:
> I attached the image to this Jira Epic:
> https://issues.apache.org/jira/browse/HUDI-4677. The RFC is WIP; I will
> create a PR in the next few days.
>
> Yeah, the basic idea is to implement lifecycle management based on the
> savepoint and time travel features, providing new ways for the user to
> operate and coordinate. It won't propose any new concepts.
>
> On Sun, 28 Aug 2022 at 02:06, Shiyan Xu <[email protected]>
> wrote:
>
> > The dev email list does not support showing images, unfortunately. You
> > may want to put it behind a link.
> >
> > As for the idea itself,
> >
> > > What I plan to do is to let Hudi support releasing a snapshot view
> > > and lifecycle management out of the box.
> >
> > Are you planning to extend the savepoint feature to have lifecycle
> > management capabilities? We should consolidate overlapping features
> > properly.
> >
> > On Sun, Aug 21, 2022 at 12:59 PM 冯健 <[email protected]> wrote:
> >
> > > Hi team,
> > > [image: image.png]
> > > For the snapshot view scenario, Hudi already provides two key
> > > features to support it:
> > >
> > > - Time travel: the user provides a timestamp to query a specific
> > >   snapshot view of a Hudi table.
> > > - Savepoint/restore: a "savepoint" saves the table as of the commit
> > >   time so that you can restore the table to this savepoint at a
> > >   later point in time if need be. In this scenario, however, users
> > >   typically use a savepoint to keep the snapshot view at a specific
> > >   timestamp from being cleaned, so that only unused files are
> > >   cleaned.
> > >
> > > However, using these features directly is inconvenient for users:
> > >
> > > - Users usually prefer a meaningful name over querying a Hudi table
> > >   by timestamp; using a timestamp in SQL may lead to the wrong
> > >   snapshot view being used. For example, we can announce that a new
> > >   tag of a Hudi table named table_nameYYYYMMDD was released, and the
> > >   user can then use this new table name to query.
> > > - Savepoint was not originally designed for this "snapshot view"
> > >   scenario; it was designed for disaster recovery. Say a new
> > >   snapshot view is created every day with a 7-day retention: we
> > >   should support lifecycle management on top of it.
> > >
> > > What I plan to do is to let Hudi support releasing a snapshot view
> > > and lifecycle management out of the box. We have already done some
> > > work on this while supporting customers' snapshot view requirements
> > > at my company, and hope to land this feature in the community too.
> > >
> > > Please feel free to let me know if you have any ideas about this.
> > >
> > > Thanks,
> > >
> > > Jian Feng
> >
> > --
> > Best,
> > Shiyan
>
--
Regards,
-Sivabalan
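
For readers following the thread, here is a rough sketch of the idea being discussed: a named tag that points at a commit time, plus a retention window after which the tag expires. It is not taken from the RFC or from Hudi's code. The names SnapshotTag, SnapshotRegistry, release, resolve and expire are all hypothetical, and the underscore between the table name and the date is an assumption about how a table_nameYYYYMMDD tag might be spelled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SnapshotTag:
    """Hypothetical record: a readable name bound to one commit time."""
    name: str              # e.g. "orders_20220821" (naming is an assumption)
    commit_time: str       # the instant a time travel query would target
    created_at: datetime   # when the tag was released
    retention: timedelta   # how long the tag should be kept


class SnapshotRegistry:
    """In-memory illustration only; a real design would persist tags."""

    def __init__(self) -> None:
        self._tags: Dict[str, SnapshotTag] = {}

    def release(self, table: str, commit_time: str, created_at: datetime,
                retention_days: int = 7) -> SnapshotTag:
        # Build a meaningful name so users do not query by raw timestamp.
        name = f"{table}_{created_at:%Y%m%d}"
        tag = SnapshotTag(name, commit_time, created_at,
                          timedelta(days=retention_days))
        self._tags[name] = tag
        return tag

    def resolve(self, name: str) -> Optional[str]:
        # Map a tag name back to the commit time, or None if unknown/expired.
        tag = self._tags.get(name)
        return tag.commit_time if tag else None

    def expire(self, now: datetime) -> List[str]:
        # Drop every tag whose retention window has fully elapsed.
        expired = sorted(name for name, tag in self._tags.items()
                         if now - tag.created_at >= tag.retention)
        for name in expired:
            del self._tags[name]
        return expired
```

In this sketch, a daily job would call release once per day and expire afterwards; with the default of 7 days, a tag released on 2022-08-21 is still resolvable on 2022-08-27 and is dropped on 2022-08-28. Whether expiring a tag should also remove an underlying savepoint is left open here, since the thread does not settle it.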
