Hi community, we have submitted RFC-65 Partition TTL Management in this PR:
https://github.com/apache/hudi/pull/8062

Let me know if you have any questions or concerns about this proposal.
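
As a rough illustration of the expiration-time strategy described in the original message quoted below, here is a minimal Python sketch. It is not taken from the RFC or from Hudi's code: `PartitionInfo` and `find_expired_partitions` are hypothetical names, and the sketch assumes each partition's last-modified time is already known.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class PartitionInfo:
    """Hypothetical summary of one table partition; not a Hudi class."""
    path: str
    last_modified: datetime


def find_expired_partitions(partitions: List[PartitionInfo],
                            ttl: timedelta,
                            now: datetime) -> List[str]:
    """Return the paths of partitions last modified earlier than now - ttl."""
    cutoff = now - ttl
    return [p.path for p in partitions if p.last_modified < cutoff]


if __name__ == "__main__":
    now = datetime(2022, 10, 19)
    partitions = [
        PartitionInfo("dt=2022-10-18", datetime(2022, 10, 18)),
        PartitionInfo("dt=2022-09-01", datetime(2022, 9, 1)),
    ]
    # With a 30-day TTL, only the September partition is outdated.
    print(find_expired_partitions(partitions, timedelta(days=30), now))
```

The returned paths would then be handed to the delete-partition step, with the physical removal left to the clean service, as described in the thread.
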
At 2022-10-21 14:42:10, "stream2000" <18889897...@163.com> wrote:
>Yes we can have a talk about it. We will try our best to write the RFC, maybe 
>publish it in a few weeks.
>
>
>> On Oct 21, 2022, at 10:18, JerryYue <272614...@qq.com.INVALID> wrote:
>> 
>> Looking forward to the RFC
>> It's a good idea; we also need Hudi data TTL in some cases.
>> Do we have a plan or timeline for this? We also have some simple designs to
>> implement it.
>> Maybe we can have a talk about it.
>> 
>> On 2022/10/20 at 9:47 AM, "Bingeng
>> Huang" <dev-return-5022-272614347=qq....@hudi.apache.org on behalf of
>> hbgstc...@gmail.com> wrote:
>> 
>>    Looking forward to the RFC.
>>    We can propose an RFC to support TTL config using non-partition fields
>> afterwards.
>> 
>> 
>> 
>>    On Wed, Oct 19, 2022 at 14:42, sagar sumit <cod...@apache.org> wrote:
>> 
>>> +1 Very nice idea. Looking forward to the RFC!
>>> 
>>> On Wed, Oct 19, 2022 at 10:13 AM Shiyan Xu <xu.shiyan.raym...@gmail.com>
>>> wrote:
>>> 
>>>> Great proposal. Partition TTL is a good starting point. We can extend it
>>>> to other TTL strategies, like column-based, and make it customizable and
>>>> pluggable. Looking forward to the RFC!
>>>> 
>>>> On Wed, Oct 19, 2022 at 11:40 AM Jian Feng <jian.f...@shopee.com.invalid>
>>>> wrote:
>>>> 
>>>>> Good idea,
>>>>> this is definitely worth an RFC.
>>>>> BTW, should it only depend on Hudi's partitions? I feel it should be a
>>>>> more common feature, since sometimes customers' data cannot be updated
>>>>> across partitions.
>>>>> 
>>>>> 
>>>>> On Wed, Oct 19, 2022 at 11:07 AM stream2000 <18889897...@163.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi all, we have implemented partition-based data TTL management,
>>>>>> which manages TTL for Hudi partitions by size, expiration time, and
>>>>>> sub-partition count. When a partition is detected as outdated, we use
>>>>>> the delete-partition interface to delete it, which generates a
>>>>>> replace commit to mark the data as deleted. The actual deletion is
>>>>>> then done by the clean service.
>>>>>> 
>>>>>> 
>>>>>> If the community is interested in this idea, maybe we can propose an
>>>>>> RFC to discuss it in detail.
>>>>>> 
>>>>>> 
>>>>>>> On Oct 19, 2022, at 10:06, Vinoth Chandar <vin...@apache.org>
>>>>>>> wrote:
>>>>>>> 
>>>>>>> +1, would love to discuss this in an RFC proposal.
>>>>>>> 
>>>>>>> On Tue, Oct 18, 2022 at 13:11 Alexey Kudinkin <ale...@onehouse.ai>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> That's a very interesting idea.
>>>>>>>> 
>>>>>>>> Do you want to take a stab at writing a full proposal (in the form
>>>>>>>> of an RFC) for it?
>>>>>>>> 
>>>>>>>> On Tue, Oct 18, 2022 at 10:20 AM Bingeng Huang <hbgstc...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> Do we have a plan to integrate data TTL into Hudi, so we don't have
>>>>>>>>> to schedule an offline Spark job to delete outdated data? We would
>>>>>>>>> just set a TTL config, and the writer or some offline service would
>>>>>>>>> delete old data as expected.
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> *Jian Feng,冯健*
>>>>> Shopee | Engineer | Data Infrastructure
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best,
>>>> Shiyan
>>>> 
>>> 
>> 
