Looking forward to the RFC. We can propose an RFC about supporting TTL config using a non-partition field afterwards.

On Wed, Oct 19, 2022 at 14:42, sagar sumit <[email protected]> wrote:

> +1 Very nice idea. Looking forward to the RFC!
>
> On Wed, Oct 19, 2022 at 10:13 AM Shiyan Xu <[email protected]> wrote:
>
> > Great proposal. Partition TTL is a good starting point. We can extend it
> > to other TTL strategies, like column-based, and make it customizable and
> > pluggable. Looking forward to the RFC!
> >
> > On Wed, Oct 19, 2022 at 11:40 AM Jian Feng <[email protected]> wrote:
> >
> > > Good idea, this is definitely worth an RFC.
> > > By the way, should it only depend on Hudi's partitions? I feel it
> > > should be a more common feature, since sometimes customers' data
> > > cannot be updated across partitions.
> > >
> > > On Wed, Oct 19, 2022 at 11:07 AM stream2000 <[email protected]> wrote:
> > >
> > > > Hi all, we have implemented partition-based data TTL management,
> > > > with which we can manage TTL for Hudi partitions by size, expiration
> > > > time, and sub-partition count. When a partition is detected as
> > > > outdated, we use the delete partition interface to delete it, which
> > > > generates a replace commit to mark the data as deleted. The real
> > > > deletion is then done by the clean service.
> > > >
> > > > If the community is interested in this idea, maybe we can propose an
> > > > RFC to discuss it in detail.
> > > >
> > > > > On Oct 19, 2022, at 10:06, Vinoth Chandar <[email protected]> wrote:
> > > > >
> > > > > +1 love to discuss this on a RFC proposal.
> > > > >
> > > > > On Tue, Oct 18, 2022 at 13:11 Alexey Kudinkin <[email protected]> wrote:
> > > > >
> > > > >> That's a very interesting idea.
> > > > >>
> > > > >> Do you want to take a stab at writing a full proposal (in the
> > > > >> form of RFC) for it?
> > > > >>
> > > > >> On Tue, Oct 18, 2022 at 10:20 AM Bingeng Huang <[email protected]> wrote:
> > > > >>
> > > > >>> Hi all,
> > > > >>>
> > > > >>> Do we have a plan to integrate data TTL into Hudi, so we don't
> > > > >>> have to schedule an offline Spark job to delete outdated data?
> > > > >>> Just set a TTL config, and the writer or some offline service
> > > > >>> will delete old data as expected.
> > >
> > > --
> > > *Jian Feng,冯健*
> > > Shopee | Engineer | Data Infrastructure
> >
> > --
> > Best,
> > Shiyan
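
For readers following the thread, the time-based part of the partition TTL idea stream2000 describes (detect outdated partitions, then hand them to a delete-partition call) could be sketched roughly as below. This is only an illustration under stated assumptions: `PartitionStat` and `find_expired_partitions` are hypothetical names, not Hudi APIs, and the size and sub-partition-count criteria mentioned in the thread are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class PartitionStat:
    """Hypothetical summary of one table partition (not a Hudi class)."""
    path: str                 # e.g. "dt=2022-10-01"
    last_modified: datetime   # time of the last write to this partition


def find_expired_partitions(
    stats: List[PartitionStat], ttl: timedelta, now: datetime
) -> List[str]:
    """Return paths of partitions whose last write is older than the TTL."""
    return [s.path for s in stats if now - s.last_modified > ttl]


if __name__ == "__main__":
    now = datetime(2022, 10, 19)
    stats = [
        PartitionStat("dt=2022-10-01", datetime(2022, 10, 1)),
        PartitionStat("dt=2022-10-18", datetime(2022, 10, 18)),
    ]
    # The resulting paths would then be passed to a delete-partition
    # operation; that step is intentionally not shown here.
    print(find_expired_partitions(stats, timedelta(days=7), now))
```

Keeping the selection step as a pure function, separate from the actual deletion, would make it straightforward to plug in other expiry strategies later, as suggested in the replies.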
