hudi-bot opened a new issue, #14770:
URL: https://github.com/apache/hudi/issues/14770

   For example: keep only records that were updated within the last month.
   
    
   
   GH: https://github.com/apache/hudi/issues/2743
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-1741
   - Type: New Feature
   
   
   ---
   
   
   ## Comments
   
   31/Mar/21 00:43;vbalaji;[~shivnarayan] : FYI;;;
   
   ---
   
   03/Apr/21 16:28;pratyakshsharma;I guess the same can be handled by HUDI-349 
(https://issues.apache.org/jira/browse/HUDI-349)? [~vbalaji] [~shivnarayan];;;
   
   ---
   
   05/Apr/21 15:25;aditiwari;[~pratyakshsharma] I guess that with a time-based 
cleaning policy, we might need some modifications in the compactor as well. 
   
   Even a recently updated base file may still contain some older records.
   
   
   A time-based cleaner, combined with filtering out records with an older 
commit time while compacting (in MOR) or rewriting (in COW) the base file, 
should solve the issue.;;;
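
To make the suggestion above concrete, here is a minimal sketch in plain Python, not Hudi's actual compactor code. It assumes each record is a dict carrying Hudi's `_hoodie_commit_time` metadata field and that the value starts with a `yyyyMMddHHmmss` timestamp. The function name `rewrite_without_expired` and the `ttl`/`now` parameters are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

# Hudi's per-record commit-time metadata column.
COMMIT_TIME_FIELD = "_hoodie_commit_time"


def parse_commit_time(value):
    # Assumption: the instant starts with yyyyMMddHHmmss; trailing millis are ignored.
    return datetime.strptime(value[:14], "%Y%m%d%H%M%S")


def rewrite_without_expired(records, ttl, now):
    """Return the records that would survive a base-file rewrite under the TTL."""
    cutoff = now - ttl
    return [r for r in records if parse_commit_time(r[COMMIT_TIME_FIELD]) >= cutoff]


records = [
    {"id": 1, COMMIT_TIME_FIELD: "20210301120000"},     # older than the TTL
    {"id": 2, COMMIT_TIME_FIELD: "20210401093000"},     # recent
    {"id": 3, COMMIT_TIME_FIELD: "20210402093000123"},  # recent, with millis
]
kept = rewrite_without_expired(records, timedelta(days=30), datetime(2021, 4, 5))
```

Filtering per record rather than per file is the point here: a base file that was rewritten recently can still hold rows whose own commit time is past the TTL.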
   
   ---
   
   28/Oct/22 03:09;nicholasjiang;[~shivnarayan], IMO, each Hudi record carries 
its Hudi commit time. The solution is to first honor the TTL at read time: do 
not return expired data when querying, or even push the filter down to the data 
source directly. The expired data is then physically deleted during operations 
such as clustering that need to rewrite the data anyway. WDYT?
   
   cc [~xleesf] ;;;
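
The two-phase approach described above could look roughly like this. It is a hedged sketch in plain Python, not Hudi's reader or clustering API: the `TtlTable` class and its `scan` and `cluster` methods are hypothetical names, and the records are assumed to be dicts whose `_hoodie_commit_time` value starts with `yyyyMMddHHmmss`.

```python
from datetime import datetime, timedelta


class TtlTable:
    """Toy table: expired rows are hidden on read and removed on rewrite."""

    def __init__(self, records, ttl):
        self.records = list(records)
        self.ttl = ttl

    def _is_live(self, record, now):
        # Assumption: the commit time starts with yyyyMMddHHmmss.
        committed = datetime.strptime(
            record["_hoodie_commit_time"][:14], "%Y%m%d%H%M%S"
        )
        return committed >= now - self.ttl

    def scan(self, now):
        # Read path: filter only, stored data is left untouched.
        return [r for r in self.records if self._is_live(r, now)]

    def cluster(self, now):
        # Rewrite path: physically drop expired rows, return how many were removed.
        before = len(self.records)
        self.records = self.scan(now)
        return before - len(self.records)


table = TtlTable(
    [
        {"id": 1, "_hoodie_commit_time": "20220901000000"},
        {"id": 2, "_hoodie_commit_time": "20221020000000"},
    ],
    ttl=timedelta(days=30),
)
now = datetime(2022, 10, 28)
visible = table.scan(now)      # expired row is hidden, but still stored
removed = table.cluster(now)   # expired row is now physically gone
```

Splitting the work this way means readers see TTL-correct results immediately, while the cost of deletion is deferred to a rewrite that would happen regardless.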
   
   ---
   
   28/Oct/22 03:29;xleesf;[~nicholasjiang] agree with the solution;;;


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

