Yue Zhang created HUDI-2683:
-------------------------------

             Summary: Parallelize deleting archived hoodie commits 
                 Key: HUDI-2683
                 URL: https://issues.apache.org/jira/browse/HUDI-2683
             Project: Apache Hudi
          Issue Type: Task
            Reporter: Yue Zhang


Currently, hoodie takes 5s to delete 30 archived commits, and it gets worse 
with a larger archive threshold, e.g. archive.max_commits set to 100 or more.

This is because hoodie deletes archived commits serially in the driver.

This delay can be unacceptable for Spark Streaming jobs with a second-level 
batch interval.

We need to delete archived commits in parallel.
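
As a rough illustration of the idea (not hoodie's actual implementation), the 
sketch below deletes a list of files on a fixed-size thread pool instead of one 
by one. The class name ParallelArchivedCommitDeleter, the method 
deleteInParallel, and the parallelism parameter are all hypothetical, and it 
uses local java.nio paths rather than hoodie's own file system abstraction.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, for illustration only; not part of the hoodie code base.
final class ParallelArchivedCommitDeleter {

    // Deletes every given file on a worker thread and returns how many
    // files were actually removed. Missing files are skipped, not errors.
    static int deleteInParallel(List<Path> archivedFiles, int parallelism)
            throws InterruptedException, ExecutionException {
        if (archivedFiles.isEmpty()) {
            return 0;
        }
        // Never start more threads than there are files to delete.
        int threads = Math.max(1, Math.min(parallelism, archivedFiles.size()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (Path file : archivedFiles) {
                Callable<Boolean> task = () -> Files.deleteIfExists(file);
                results.add(pool.submit(task));
            }
            int deleted = 0;
            for (Future<Boolean> result : results) {
                // get() rethrows any IOException wrapped in ExecutionException.
                if (result.get()) {
                    deleted++;
                }
            }
            return deleted;
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as ParallelArchivedCommitDeleter.deleteInParallel(files, 8) would 
issue up to 8 deletes at a time. In a Spark job the same fan-out could instead 
be distributed to executors; which approach fits better is left open here.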

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
