n3nash commented on issue #2743:
URL: https://github.com/apache/hudi/issues/2743#issuecomment-815456351
@aditiwari01 I think you mentioned two issues here:
1. Record-level TTL -> Hudi doesn't have such a feature. As others have
pointed out, the `hudiTable.deletePartitions()` API is one way to manage
older partitions. Yes, you could partition on `_hoodie_commit_time`, or use
any other date-based partitioning scheme, so that older partitions can be
deleted in their entirety.
2. Duplicates across partitions -> If you have an update workload and are
using the `upsert` API, then yes, using a global index will help eliminate
duplicates across partitions in your table.
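
For point 2, here is a minimal sketch of the write options involved, written as a plain Python dict that could be passed to a PySpark write. The keys are standard Hudi configs, but defaults vary by Hudi version, so check them against the release you run. The required table name, record key, and precombine options are left out, and `df` and `base_path` in the trailing comment are placeholders.

```python
# Sketch only: options for an upsert that uses a global index, so a record key
# is looked up across all partitions rather than only within its own partition.
hudi_options = {
    # Use the upsert write operation (update if the key exists, insert otherwise).
    "hoodie.datasource.write.operation": "upsert",
    # GLOBAL_BLOOM enforces key uniqueness across partitions.
    "hoodie.index.type": "GLOBAL_BLOOM",
    # Assumption: you want a record whose partition value changed to move to
    # the new partition instead of being updated in the old one.
    "hoodie.bloom.index.update.partition.path": "true",
}

# Typical use (needs a Spark session with the Hudi bundle on the classpath):
# df.write.format("hudi").options(**hudi_options).mode("append").save(base_path)
```

A global index has to look up incoming keys against every partition, so expect upserts to be slower on large tables than with a partition-scoped index.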
As @nsivabalan pointed out, the Spark datasource doesn't support this out of
the box, but there is a low-level API, as pointed out above. Contributions are
welcome, and it would be good to add this support to the Spark datasource. Let
me know if you want to contribute this feature and we can guide you.
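
To make point 1 a bit more concrete, here is a rough sketch of choosing which date partitions have aged out of a retention window. It assumes a `yyyy/MM/dd` partition layout, and `expired_partitions` is a hypothetical helper, not part of Hudi. It only builds the list of partition paths; the actual delete would still go through the low-level API mentioned above.

```python
from datetime import date, timedelta


def expired_partitions(partition_paths, today, retention_days):
    """Return the yyyy/MM/dd partition paths that are older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for path in partition_paths:
        # Assumption: every partition path is exactly "yyyy/MM/dd".
        year, month, day = (int(part) for part in path.split("/"))
        if date(year, month, day) < cutoff:
            expired.append(path)
    return expired


partitions = ["2021/01/15", "2021/03/01", "2021/04/07"]
to_delete = expired_partitions(partitions, today=date(2021, 4, 8), retention_days=30)
print(to_delete)  # ['2021/01/15', '2021/03/01']
```

Note that this gives partition-level expiry, not a true record-level TTL: a record is only dropped when its whole partition falls outside the window.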