Sounds like a nice feature to have. Eagerly looking forward to the RFC.

On Sat, 27 Aug 2022 at 20:51, 冯健 <[email protected]> wrote:

> I attached the image to this Jira epic:
> https://issues.apache.org/jira/browse/HUDI-4677. The RFC is WIP; I will
> create a PR in the next few days.
> Yeah, the basic idea is to implement lifecycle management based on the
> savepoint and time travel features, providing new ways for the user to
> operate and coordinate. It won't propose any new concepts.
>
> On Sun, 28 Aug 2022 at 02:06, Shiyan Xu <[email protected]>
> wrote:
>
> > Unfortunately, the dev email list does not support showing images. You
> > may want to put it behind a link.
> >
> > As for the idea itself,
> >
> > > What I plan to do is to have Hudi support releasing a snapshot view and
> > > lifecycle management out of the box.
> >
> >
> > Are you planning to extend the savepoint feature to have lifecycle
> > management capabilities? We should consolidate overlapping features properly.
> >
> > On Sun, Aug 21, 2022 at 12:59 PM 冯健 <[email protected]> wrote:
> >
> > > Hi team,
> > > [image: image.png]
> > > For the snapshot view scenario, Hudi already provides two key
> > > features to support it:
> > >
> > >    - Time travel: the user provides a timestamp to query a specific
> > >    snapshot view of a Hudi table.
> > >    - Savepoint/restore: a "savepoint" saves the table as of the commit
> > >    time, so that you can restore the table to this savepoint at a later
> > >    point in time if need be. In this scenario, however, users typically
> > >    use a savepoint to keep the cleaner from removing the snapshot view
> > >    at a specific timestamp, so that only unused files are cleaned.
> > >
> > > However, using these features directly is inconvenient for users:
> > >
> > >    - Users usually prefer a meaningful name over querying a Hudi table
> > >    by timestamp; using a timestamp in SQL may lead to the wrong
> > >    snapshot view being used. For example, we can announce that a new
> > >    tag of a Hudi table named table_nameYYYYMMDD was released, and the
> > >    user can then use this new table name to query.
> > >    - Savepoint was not originally designed for this "snapshot view"
> > >    scenario; it was designed for disaster recovery. Say a new snapshot
> > >    view is created every day with 7 days of retention: we should
> > >    support lifecycle management on top of it.
> > >
> > > What I plan to do is to have Hudi support releasing a snapshot view and
> > > lifecycle management out of the box. We have already done some work
> > > supporting customers' snapshot view requirements at my company, and
> > > hope to land this feature in the community too.
> > >
> > > Please feel free to let me know if you have any ideas about this.
> > >
> > > Thanks,
> > >
> > > Jian Feng
> > >
> >
> >
> > --
> > Best,
> > Shiyan
> >
>


-- 
Regards,
-Sivabalan
