Hi Sagar,
HMS shouldn't be the core part; the external table location will depend on
which metastore the user is using.
I'm still working on it and will add more detail in this RFC PR:
https://github.com/apache/hudi/pull/6576


On Fri, 16 Sept 2022 at 11:28, sagar sumit <cod...@apache.org> wrote:

> Automatic lifecycle management based on a few configurations
> would be very useful for the community.
>
> I read the description in
> https://issues.apache.org/jira/browse/HUDI-4677
> May I ask the rationale for choosing
> Hive Metastore to manage the snapshots?
>
> Perhaps the RFC will have more details. Looking forward to it!
>
> Regards,
> Sagar
>
>
> On Wed, Sep 14, 2022 at 8:13 AM 冯健 <fengjian...@gmail.com> wrote:
>
> > Hi Ethan,
> >
> >     Yes, based on the current situation, we still need to do a lot of extra
> > work to provide the snapshot view feature for users (or users have to do
> > this themselves). I plan to merge at least the COW part of this feature
> > in 0.13.0, and will consider your suggestion if time is tight.
> > Thanks
> >
> >
> >
> > On Wed, 14 Sept 2022 at 03:02, Y Ethan Guo <yi...@apache.org> wrote:
> >
> > > Hi Feng Jian,
> > >
> > > Looking forward to the RFC!  Is the snapshot view management more like
> > > managing commits / savepoints in the Hudi timeline and hiding Hudi
> > > internals from the users?
> > >
> > > Do you plan to merge the implementation of snapshot view and lifecycle
> > > management for the next major release (0.13.0)?  Timeline-wise, if time
> > > is tight, you may also consider scoping out a subset of features to
> > > target 0.13.0.
> > >
> > > Best,
> > > - Ethan
> > >
> > > On Mon, Sep 12, 2022 at 10:43 PM Sivabalan <n.siv...@gmail.com> wrote:
> > >
> > > > Sounds like a nice feature to have. Eagerly looking forward to the RFC.
> > > >
> > > > On Sat, 27 Aug 2022 at 20:51, 冯健 <fengjian...@gmail.com> wrote:
> > > >
> > > > > I attached the image in this Jira Epic
> > > > > https://issues.apache.org/jira/browse/HUDI-4677, and the RFC is WIP;
> > > > > I will create a PR in the next few days.
> > > > > Yeah, the basic idea is to implement lifecycle management based on the
> > > > > savepoint and time travel features, providing new ways for the user to
> > > > > operate and coordinate. It won't propose any new concepts.
> > > > >
> > > > > On Sun, 28 Aug 2022 at 02:06, Shiyan Xu <xu.shiyan.raym...@gmail.com> wrote:
> > > > >
> > > > > > The dev email list does not support showing images, unfortunately.
> > > > > > You may want to put it behind a link.
> > > > > >
> > > > > > As for the idea itself,
> > > > > >
> > > > > > > What I plan to do is to let Hudi support releasing a snapshot view and
> > > > > > > lifecycle management out of the box.
> > > > > >
> > > > > >
> > > > > > Are you planning to extend the savepoint feature to have lifecycle
> > > > > > mgmt capabilities? We should consolidate overlapping features properly.
> > > > > >
> > > > > > On Sun, Aug 21, 2022 at 12:59 PM 冯健 <fengjian...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi team,
> > > > > > > [image: image.png]
> > > > > > >     For the snapshot view scenario, Hudi already provides two key
> > > > > > > features to support it:
> > > > > > >
> > > > > > >    - Time travel: the user provides a timestamp to query a specific
> > > > > > >    snapshot view of a Hudi table.
> > > > > > >    - Savepoint/restore: a "savepoint" saves the table as of the commit
> > > > > > >    time so that you can restore the table to this savepoint at a later
> > > > > > >    point in time if need be. In this case, however, the user usually
> > > > > > >    uses it to prevent the snapshot view at a specific timestamp from
> > > > > > >    being cleaned, so that only unused files are cleaned.
> > > > > > > However, using these features directly is inconvenient for users:
> > > > > > >
> > > > > > >    - Users usually prefer a meaningful name to querying a Hudi table
> > > > > > >    with a timestamp; using a timestamp in SQL may lead to the wrong
> > > > > > >    snapshot view being used. For example, we can announce that a new
> > > > > > >    tag of a Hudi table named table_nameYYYYMMDD was released, and the
> > > > > > >    user can then query it by this new table name.
> > > > > > >    - Savepoint was not originally designed for this "snapshot view"
> > > > > > >    scenario; it was designed for disaster recovery. Let's say a new
> > > > > > >    snapshot view is created every day with 7 days of retention: we
> > > > > > >    should support lifecycle management on top of it.
> > > > > > >
> > > > > > > What I plan to do is to let Hudi support releasing a snapshot view and
> > > > > > > lifecycle management out of the box. We have already done some work on
> > > > > > > this while supporting customers' snapshot view requirements at my
> > > > > > > company, and hope to land this feature in the community too.
> > > > > > >
> > > > > > > Please feel free to let me know if you have any ideas about this.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jian Feng
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best,
> > > > > > Shiyan
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > > -Sivabalan
> > > >
> > >
> >
>
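
The retention example from the original proposal (a snapshot view released
every day under a name like table_nameYYYYMMDD and kept for 7 days) could be
sketched roughly as below. This is only an illustration of the naming and
expiry rule, not the design in the RFC: snapshot_name, parse_snapshot_day,
expired_snapshots and retention_days are hypothetical names, none of them is
a Hudi API, and the "strictly older than the cutoff" rule is an assumption.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List, Optional


def snapshot_name(table: str, day: date) -> str:
    """Build a tag name of the form table_nameYYYYMMDD (hypothetical helper)."""
    return f"{table}{day:%Y%m%d}"


def parse_snapshot_day(table: str, name: str) -> Optional[date]:
    """Return the day encoded in a tag name, or None if it does not match."""
    if not name.startswith(table):
        return None
    suffix = name[len(table):]
    if len(suffix) != 8 or not suffix.isdigit():
        return None
    try:
        return datetime.strptime(suffix, "%Y%m%d").date()
    except ValueError:  # eight digits that are not a valid calendar date
        return None


def expired_snapshots(table: str, names: Iterable[str], today: date,
                      retention_days: int = 7) -> List[str]:
    """List tags strictly older than the retention window (assumed rule)."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in names:
        day = parse_snapshot_day(table, name)
        if day is not None and day < cutoff:
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    today = date(2022, 9, 16)
    tags = [snapshot_name("orders", date(2022, 9, d)) for d in (1, 8, 9, 16)]
    print(expired_snapshots("orders", tags, today))
```

In a real implementation the expired names would presumably drive removal of
the corresponding savepoints and metastore entries; that part depends on the
RFC and is left out here.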
