[ https://issues.apache.org/jira/browse/HUDI-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lokesh Jain updated HUDI-8208:
------------------------------
    Sprint: Hudi 1.0 Sprint 2024/09/16-22

> Fix partition stats with compaction or clustering
> -------------------------------------------------
>
>                 Key: HUDI-8208
>                 URL: https://issues.apache.org/jira/browse/HUDI-8208
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Major
>             Fix For: 1.0.0
>
>
> Consider a partition with 10 file slices. If compaction is triggered for one
> file group and produces file slice fs1_1, the partition stats are updated for
> that file slice under the same key (the partition path). The older partition
> stat record for that partition path still accounts for the other 9 file
> slices (fs2_0 - fs10_0) plus the older slice (fs1_0). The value read back is
> therefore a merge over all versions of the file slices (fs2_0 - fs10_0,
> fs1_0, fs1_1), when it should only account for the latest version of fs1.
> Upon compaction or clustering, the partition stat should be recomputed and
> the older records for that partition should be invalidated.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
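
The merge behavior described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hudi's implementation: ColumnStat, stat_of and merge are hypothetical names, the stat is reduced to min/max/count for a single integer column, and the values in each file slice are made up for illustration.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class ColumnStat:
    """Toy per-column summary (hypothetical, not a Hudi class)."""
    min_value: int
    max_value: int
    value_count: int


def stat_of(values):
    """Summarize the values held by one file slice."""
    return ColumnStat(min(values), max(values), len(values))


def merge(a, b):
    """Assumed merge rule: widen the min/max range and add the counts."""
    return ColumnStat(
        min(a.min_value, b.min_value),
        max(a.max_value, b.max_value),
        a.value_count + b.value_count,
    )


# Ten file slices in one partition before compaction (made-up values).
base_slices = {"fs1_0": [1, 100]}
base_slices.update({f"fs{i}_0": [10, 20] for i in range(2, 11)})

# Existing partition stat record, keyed by partition path.
old_record = reduce(merge, (stat_of(v) for v in base_slices.values()))

# Compaction rewrites fs1_0 into fs1_1; here an update lowered 100 to 50.
fs1_1 = [1, 50]

# Behavior described in the issue: the new stat for fs1_1 is written under
# the same key and merged with the old record, so fs1_0 is still counted.
merged_on_read = merge(old_record, stat_of(fs1_1))

# Proposed behavior: invalidate the old record and recompute the stat from
# the latest file slice of every file group.
latest_slices = {k: v for k, v in base_slices.items() if k != "fs1_0"}
latest_slices["fs1_1"] = fs1_1
recomputed = reduce(merge, (stat_of(v) for v in latest_slices.values()))

print("merged on read:", merged_on_read)
print("recomputed:    ", recomputed)
```

In this toy model the merged record keeps the stale maximum from fs1_0 and counts the rows of fs1 twice, while the recomputed record reflects only fs1_1 and fs2_0 - fs10_0.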