[ https://issues.apache.org/jira/browse/HUDI-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sagar Sumit updated HUDI-8208:
------------------------------
    Remaining Estimate: 12h  (was: 8h)

> Fix partition stats with compaction or clustering
> -------------------------------------------------
>
>                 Key: HUDI-8208
>                 URL: https://issues.apache.org/jira/browse/HUDI-8208
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Blocker
>             Fix For: 1.0.0
>
>   Original Estimate: 8h
>          Time Spent: 8h
>  Remaining Estimate: 12h
>
> Consider a partition with 10 file slices. If compaction is triggered for one
> file slice and produces fs1_1, the partition stats are updated for that file
> slice under the same key (the partition path). The older partition stat
> record for that partition path still accounts for the other 9 file slices
> (fs2_0 - fs10_0) plus the older version of the compacted slice (fs1_0). The
> final read value is therefore a merge of all versions of the file slices
> (fs2_0 - fs10_0, fs1_0, fs1_1), when it should account only for the latest
> version of fs1.
> Upon compaction or clustering, the partition stats should be recomputed and
> the older records for that partition should be invalidated.
> Also add a validation test in
> org.apache.hudi.utilities.TestHoodieMetadataTableValidator#testPartitionStatsValidation



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
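
The merge problem described in the issue can be illustrated with a small, self-contained sketch. This is not Hudi code: `ColumnRange`, `aggregate`, and all the values are hypothetical, and the sketch assumes that a partition stat is a per-column min/max range aggregated over the file slices of the partition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRange:
    """Hypothetical stand-in for a partition-level column stat (min/max only)."""
    min_value: int
    max_value: int

    def merge(self, other: "ColumnRange") -> "ColumnRange":
        # Merging two stat records widens the range to cover both.
        return ColumnRange(min(self.min_value, other.min_value),
                           max(self.max_value, other.max_value))


def aggregate(ranges):
    """Fold a non-empty collection of ranges into a single range."""
    ranges = list(ranges)
    result = ranges[0]
    for r in ranges[1:]:
        result = result.merge(r)
    return result


# Ten file slices at version 0; fs1_0 holds the partition-wide extremes.
slices_v0 = {"fs1": ColumnRange(0, 100)}
for i in range(2, 11):
    slices_v0[f"fs{i}"] = ColumnRange(10, 50)

# Partition stat record written before compaction (covers fs1_0 .. fs10_0).
old_record = aggregate(slices_v0.values())

# Compaction rewrites fs1 into fs1_1 with a narrower range (made-up values).
fs1_1 = ColumnRange(20, 40)

# Behaviour described in the issue: the new record for fs1_1 is merged with
# the old record on read, so the stale fs1_0 range still leaks into the result.
stale = old_record.merge(fs1_1)

# Proposed behaviour: invalidate the old record and recompute the stat from
# the latest version of every file slice in the partition.
latest = dict(slices_v0)
latest["fs1"] = fs1_1
recomputed = aggregate(latest.values())

print("merged with stale record:", stale)
print("recomputed from latest:  ", recomputed)
```

In this sketch the merged value keeps the range contributed by fs1_0 even though that slice version is no longer current, while the recomputed value reflects only fs1_1 and fs2_0 - fs10_0.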