left some comments. thanks!

On Fri, 31 Mar 2023 at 00:59, 符其军 <18889897...@163.com> wrote:

> Hi community, we have submitted RFC-65 Partition TTL Management in this
> PR: https://github.com/apache/hudi/pull/8062
>
> Let me know if you have any questions or concerns with this proposal.
> At 2022-10-21 14:42:10, "stream2000" <18889897...@163.com> wrote:
> >Yes, we can have a talk about it. We will try our best to write the RFC
> >and maybe publish it in a few weeks.
> >
> >
> >> On Oct 21, 2022, at 10:18, JerryYue <272614...@qq.com.INVALID> wrote:
> >>
> >> Looking forward to the RFC
> >> It's a good idea; we also need Hudi data TTL in some cases.
> >> Do we have any plan or timeline for this? We also had some simple designs
> >> to implement it.
> >> Maybe we can have a talk about it.
> >>
> >> On 2022/10/20 at 9:47 AM, "Bingeng Huang"
> >> <dev-return-5022-272614347=qq....@hudi.apache.org on behalf of hbgstc...@gmail.com> wrote:
> >>
> >>    Looking forward to the RFC.
> >>    We can propose an RFC about supporting TTL config using a non-partition
> >>    field afterwards.
> >>
> >>
> >>
> >>    sagar sumit <cod...@apache.org> wrote on Wed, Oct 19, 2022 at 14:42:
> >>
> >>> +1 Very nice idea. Looking forward to the RFC!
> >>>
> >>> On Wed, Oct 19, 2022 at 10:13 AM Shiyan Xu <xu.shiyan.raym...@gmail.com>
> >>> wrote:
> >>>
> >>>> Great proposal. Partition TTL is a good starting point. We can extend it
> >>>> to other TTL strategies like column-based, and make it customizable and
> >>>> pluggable. Looking forward to the RFC!
> >>>>
> >>>> On Wed, Oct 19, 2022 at 11:40 AM Jian Feng <jian.f...@shopee.com.invalid>
> >>>> wrote:
> >>>>
> >>>>> Good idea,
> >>>>> this is definitely worth an RFC.
> >>>>> BTW, should it only depend on Hudi's partitions? I feel it should be a more
> >>>>> general feature, since sometimes customers' data cannot be updated across
> >>>>> partitions.
> >>>>>
> >>>>>
> >>>>> On Wed, Oct 19, 2022 at 11:07 AM stream2000 <18889897...@163.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi all, we have implemented partition-based data TTL management, with
> >>>>>> which we can manage TTL for Hudi partitions by size, expiration time, and
> >>>>>> sub-partition count. When a partition is detected as outdated, we use the
> >>>>>> delete partition interface to delete it, which generates a replace
> >>>>>> commit to mark the data as deleted. The actual deletion is then done by
> >>>>>> the clean service.
> >>>>>>
> >>>>>>
> >>>>>> If the community is interested in this idea, maybe we can propose an RFC to
> >>>>>> discuss it in detail.
> >>>>>>
> >>>>>>
> >>>>>>> On Oct 19, 2022, at 10:06, Vinoth Chandar <vin...@apache.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> +1, would love to discuss this in an RFC proposal.
> >>>>>>>
> >>>>>>> On Tue, Oct 18, 2022 at 13:11 Alexey Kudinkin <ale...@onehouse.ai>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> That's a very interesting idea.
> >>>>>>>>
> >>>>>>>> Do you want to take a stab at writing a full proposal (in the form of an
> >>>>>>>> RFC) for it?
> >>>>>>>>
> >>>>>>>> On Tue, Oct 18, 2022 at 10:20 AM Bingeng Huang <hbgstc...@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi all,
> >>>>>>>>>
> >>>>>>>>> Do we have a plan to integrate data TTL into Hudi, so that we don't have to
> >>>>>>>>> schedule an offline Spark job to delete outdated data? We would just set a TTL
> >>>>>>>>> config, and then the writer or some offline service would delete old data as
> >>>>>>>>> expected.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> *Jian Feng,冯健*
> >>>>> Shopee | Engineer | Data Infrastructure
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Best,
> >>>> Shiyan
> >>>>
> >>>
> >>
>


-- 
Regards,
-Sivabalan
