[ https://issues.apache.org/jira/browse/HUDI-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan reassigned HUDI-1590:
-----------------------------------------

    Assignee: Sagar Sumit  (was: sivabalan narayanan)

> Support async clustering w/ test suite job
> ------------------------------------------
>
>                 Key: HUDI-1590
>                 URL: https://issues.apache.org/jira/browse/HUDI-1590
>             Project: Apache Hudi
>          Issue Type: Test
>          Components: Testing, tests-ci
>    Affects Versions: 0.9.0
>            Reporter: sivabalan narayanan
>            Assignee: Sagar Sumit
>            Priority: Major
>             Fix For: 0.12.0
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> As of now, we only have inline clustering support w/ the hoodie test suite job.
> We need to add support for async clustering.
> This might be tricky since regular writes should not overlap w/
> clustering; otherwise the pipeline will fail. So, data generation has to go hand
> in hand w/ the clustering configs. For example, if clustering gets triggered every
> 4 commits, data generation should switch partitions every 4 batches of
> input. That way there won't be any overlap and the pipeline can run for as
> many iterations as needed.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
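
The partition-switching rule in the description can be illustrated with a small sketch. It is not part of the hoodie test suite job: the function name, the `partition_<n>` naming, and the assumption that each input batch produces exactly one commit are all hypothetical, chosen only to show how the data generator's partition choice could be tied to the clustering interval.

```python
def partition_for_batch(batch_index: int, clustering_every_n_commits: int = 4) -> str:
    """Return a hypothetical partition name for a zero-based input batch.

    Assumes one commit per batch, so the batches written between two
    clustering triggers all land in the same partition, and the next
    group of batches moves on to a fresh partition.
    """
    if batch_index < 0:
        raise ValueError("batch_index must be non-negative")
    if clustering_every_n_commits <= 0:
        raise ValueError("clustering_every_n_commits must be positive")
    # Integer division groups consecutive batches by clustering interval.
    group = batch_index // clustering_every_n_commits
    return f"partition_{group}"


# Partitions for the first 8 batches when clustering triggers every 4 commits.
schedule = [partition_for_batch(i) for i in range(8)]
```

With the default interval of 4, batches 0 to 3 share one partition and batches 4 to 7 share the next, so new writes never target the partition whose file groups were just scheduled for clustering (under the one-commit-per-batch assumption).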