n3nash commented on issue #2743:
URL: https://github.com/apache/hudi/issues/2743#issuecomment-815456351


   @aditiwari01 I think you raised two issues here:
   
   1. Record level TTL -> We don't have such a feature in Hudi. As others 
have pointed out, the `hudiTable.deletePartitions()` API is one way to 
manage older partitions. Yes, you could partition on _hoodie_commit_time 
or any other date-based field, so that the table is structured to let you 
delete older partitions completely. 
   2. Duplicates across partitions -> If you have an update workload and are 
using the `upsert` API, then yes, using a GlobalIndex will help eliminate 
duplicates across partitions in your table. 
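
For point 2, here is a minimal sketch of the write options an upsert with a global index might use. The `hoodie.*` keys are Hudi config names as far as I know, but please verify them against your Hudi version. The helper name, table name, and column names are hypothetical and only there for illustration.

```python
def global_index_upsert_options(table_name, record_key, partition_field, precombine_field):
    """Build Hudi write options for an upsert that uses a global index.

    The helper and its arguments are hypothetical; only the "hoodie.*" keys
    are meant to be real Hudi config names (check them for your version).
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
        # A global index looks up the record key across all partitions,
        # not only within the partition of the incoming record.
        "hoodie.index.type": "GLOBAL_BLOOM",
        # Assumption: lets a record move to its new partition when the
        # partition value changes, instead of staying in the old one.
        "hoodie.bloom.index.update.partition.path": "true",
    }


# Hypothetical table and column names.
options = global_index_upsert_options("events", "event_id", "event_date", "updated_at")

# With a Spark session and a DataFrame `df` available, the options would be
# passed along these lines (not executed here):
# df.write.format("hudi").options(**options).mode("append").save(base_path)
```

Keep in mind that a global index has to look up keys across the whole table, so it is generally more expensive than a partition-scoped index on large tables.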
   
   As @nsivabalan pointed out, this is not supported out of the box in the 
Spark datasource, but the low-level API mentioned above is available. We 
welcome contributions, and it would be good to add this support to the Spark 
datasource. Let me know if you want to contribute this feature and we can 
guide you.
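
To illustrate the partition-based retention from point 1, here is a rough sketch of how you could pick the date partitions that are past a retention window. It assumes partition values are plain `YYYY-MM-DD` strings, and the function name and retention parameter are hypothetical. It only computes the list of partitions. Handing that list to the low-level delete-partitions API is left out, since the exact call depends on your Hudi version.

```python
from datetime import date, datetime, timedelta


def expired_partitions(partitions, today, retention_days, fmt="%Y-%m-%d"):
    """Return the partition values strictly older than the retention cutoff.

    Hypothetical helper: assumes each partition value is a date string in
    `fmt`. Values that do not parse as dates are skipped, never returned.
    """
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for value in partitions:
        try:
            partition_date = datetime.strptime(value, fmt).date()
        except ValueError:
            # Not a date partition; leave it alone.
            continue
        if partition_date < cutoff:
            expired.append(value)
    return sorted(expired)


# Example with made-up partition values and a 30-day retention window.
candidates = expired_partitions(
    ["2021-03-30", "2021-01-01", "2021-04-07", "not-a-date"],
    today=date(2021, 4, 8),
    retention_days=30,
)
```

Since this drops whole partitions, it gives you partition-level retention, not true record-level TTL.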



