Looking forward to the RFC.
It's a good idea; we also need Hudi data TTL in some cases.
Is there a plan or timeline for this? We also have some simple designs for
implementing it.
Maybe we can have a talk about it.

On 2022/10/20 at 9:47 AM, "Bingeng Huang" <[email protected] on behalf of [email protected]> wrote:

    Looking forward to the RFC.
    We can propose an RFC about supporting TTL config using a non-partition field after



    sagar sumit <[email protected]> wrote on Wed, Oct 19, 2022 at 14:42:

    > +1 Very nice idea. Looking forward to the RFC!
    >
    > On Wed, Oct 19, 2022 at 10:13 AM Shiyan Xu <[email protected]>
    > wrote:
    >
    > > Great proposal. Partition TTL is a good starting point. We can extend it to
    > > other TTL strategies, like column-based, and make it customizable and
    > > pluggable. Looking forward to the RFC!
    > >
    > > On Wed, Oct 19, 2022 at 11:40 AM Jian Feng <[email protected]> wrote:
    > >
    > > > Good idea,
    > > > this is definitely worth an RFC.
    > > > BTW, should it depend only on Hudi's partitions? I feel it should be a more
    > > > general feature, since sometimes customers' data cannot be updated across
    > > > partitions.
    > > >
    > > >
    > > > On Wed, Oct 19, 2022 at 11:07 AM stream2000 <[email protected]> wrote:
    > > >
    > > > > Hi all, we have implemented partition-based data TTL management, with
    > > > > which we can manage TTL for Hudi partitions by size, expiration time, and
    > > > > sub-partition count. When a partition is detected as outdated, we use the
    > > > > delete-partition interface to delete it, which generates a replace
    > > > > commit to mark the data as deleted. The actual deletion is then done by
    > > > > the clean service.
    > > > >
    > > > >
    > > > > If the community is interested in this idea, maybe we can propose an RFC to
    > > > > discuss it in detail.
    > > > >
    > > > >
    > > > > > On Oct 19, 2022, at 10:06, Vinoth Chandar <[email protected]> wrote:
    > > > > >
    > > > > > +1, would love to discuss this in an RFC proposal.
    > > > > >
    > > > > > On Tue, Oct 18, 2022 at 13:11 Alexey Kudinkin <[email protected]> wrote:
    > > > > >
    > > > > >> That's a very interesting idea.
    > > > > >>
    > > > > >> Do you want to take a stab at writing a full proposal (in the form of
    > > > > >> an RFC) for it?
    > > > > >>
    > > > > >> On Tue, Oct 18, 2022 at 10:20 AM Bingeng Huang <[email protected]> wrote:
    > > > > >>
    > > > > >>> Hi all,
    > > > > >>>
    > > > > >>> Do we have a plan to integrate data TTL into Hudi, so that we don't have to
    > > > > >>> schedule an offline Spark job to delete outdated data? We would just set a TTL
    > > > > >>> config, and the writer or some offline service would delete old data as
    > > > > >>> expected.
    > > > > >>>
    > > > > >>
    > > > >
    > > > >
    > > >
    > > > --
    > > > *Jian Feng,冯健*
    > > > Shopee | Engineer | Data Infrastructure
    > > >
    > >
    > >
    > > --
    > > Best,
    > > Shiyan
    > >
    >

