Nicholas Jiang created HUDI-6158:
------------------------------------

             Summary: Strengthen Flink clustering commit and rollback strategy
                 Key: HUDI-6158
                 URL: https://issues.apache.org/jira/browse/HUDI-6158
             Project: Apache Hudi
          Issue Type: Improvement
          Components: flink
            Reporter: Nicholas Jiang
            Assignee: Nicholas Jiang
             Fix For: 0.14.0

`ClusteringCommitSink` could strengthen its commit and rollback strategy in two ways:

* Commit: Introduce `clusteringPlanCache`, which caches the clustering plan for each instant as a mapping of instant_time -> clusteringPlan.
* Rollback: Update `commitBuffer` to store the mapping of instant_time -> file_ids -> event. A map is used to collect the events because rolling back intermediate clustering tasks generates corrupt events.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
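
The two mappings proposed in the issue could look roughly like the sketch below. It is an illustration only, not the actual Hudi code: the `Event` class, the `registerPlan` and `onEvent` methods, the string results, and the representation of a clustering plan as a set of file ids are all hypothetical stand-ins. It also assumes that a retried task re-emits an event for the same file id, so keying the buffer by file id lets the later event replace the earlier one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class ClusteringCommitSketch {

    /** Hypothetical stand-in for a clustering commit event. */
    public static final class Event {
        final String instant;
        final String fileId;
        final boolean failed;

        public Event(String instant, String fileId, boolean failed) {
            this.instant = instant;
            this.fileId = fileId;
            this.failed = failed;
        }
    }

    // instant_time -> clusteringPlan (assumption: a plan is just the set of expected file ids)
    private final Map<String, Set<String>> clusteringPlanCache = new HashMap<>();

    // instant_time -> file_ids -> event
    private final Map<String, Map<String, Event>> commitBuffer = new HashMap<>();

    /** Caches the plan once per instant so it is not reloaded for every event. */
    public void registerPlan(String instant, Set<String> expectedFileIds) {
        clusteringPlanCache.put(instant, expectedFileIds);
    }

    /** Buffers one event and returns "PENDING", "COMMIT", or "ROLLBACK". */
    public String onEvent(Event event) {
        Map<String, Event> events =
            commitBuffer.computeIfAbsent(event.instant, k -> new HashMap<>());
        // A later event for the same file id overwrites the earlier one.
        events.put(event.fileId, event);

        Set<String> plan = clusteringPlanCache.get(event.instant);
        if (plan == null || !events.keySet().containsAll(plan)) {
            return "PENDING"; // not every file group in the plan has reported yet
        }

        boolean anyFailed = events.values().stream().anyMatch(e -> e.failed);

        // The instant is resolved either way, so drop its cached state.
        commitBuffer.remove(event.instant);
        clusteringPlanCache.remove(event.instant);
        return anyFailed ? "ROLLBACK" : "COMMIT";
    }
}
```

With a list instead of the inner map, a stale event from a rolled-back task would stay in the buffer next to its replacement; keying by file id keeps one event per file group.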