Yue Zhang created HUDI-2683:
-------------------------------

             Summary: Parallelize deleting archived hoodie commits
                 Key: HUDI-2683
                 URL: https://issues.apache.org/jira/browse/HUDI-2683
             Project: Apache Hudi
          Issue Type: Task
            Reporter: Yue Zhang

Currently, Hudi takes 5s to delete 30 archived commits, and it gets worse with a larger archive threshold, for example when archive.max_commits is set to 100 or more. The cause is that Hudi deletes archived commits serially on the driver. For Spark Streaming jobs with a second-level batch interval, this delay is sometimes unacceptable. We need to delete archived commits in parallel.
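
To make the proposal concrete, here is a minimal sketch of fanning file deletes out over a thread pool instead of looping over them on one thread. It is not Hudi's implementation: it uses only the JDK and the local file system through java.nio.file, and the class name ParallelDeleteSketch, the method deleteInParallel, and the parallelism parameter are all hypothetical names chosen for illustration. A real change would go through Hudi's own file system layer and could just as well distribute the work through the engine rather than a driver-side pool.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names exist in Hudi.
final class ParallelDeleteSketch {

    /**
     * Deletes every path on a worker thread and reports, per path,
     * whether a file was actually removed. A failed delete is recorded
     * as false instead of aborting the remaining deletes.
     */
    static Map<Path, Boolean> deleteInParallel(List<Path> paths, int parallelism)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, parallelism));
        try {
            // One task per archived commit file.
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (Path path : paths) {
                tasks.add(() -> Files.deleteIfExists(path));
            }

            // invokeAll blocks until all tasks finish and returns the
            // futures in the same order as the submitted tasks.
            List<Future<Boolean>> futures = pool.invokeAll(tasks);

            Map<Path, Boolean> deleted = new LinkedHashMap<>();
            for (int i = 0; i < paths.size(); i++) {
                try {
                    deleted.put(paths.get(i), futures.get(i).get());
                } catch (ExecutionException e) {
                    // The delete threw (for example an IOException).
                    deleted.put(paths.get(i), false);
                }
            }
            return deleted;
        } finally {
            pool.shutdown();
        }
    }
}
```

With this shape, the wall-clock cost of cleaning up N archived commits is bounded by roughly N / parallelism round trips to storage rather than N, which is the effect the issue asks for. Collecting a per-file result keeps one failed delete from hiding the outcome of the others.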