[ https://issues.apache.org/jira/browse/HUDI-7104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Guo closed HUDI-7104.
---------------------------
    Resolution: Fixed

> Cleaner could miss to clean up some files w/ savepoint interplay
> -----------------------------------------------------------------
>
>                 Key: HUDI-7104
>                 URL: https://issues.apache.org/jira/browse/HUDI-7104
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: cleaning, savepoint, table-service
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.15.0, 1.0.0
>
>
> Let's say partitioning is day-based and keyed on the created date, so older
> partitions generally do not receive any new data after a few days.
>
> Let's say a savepoint was added to one day and later removed:
> day1: cleaned up.
> day2: savepoint added, so the cleaner ignored it.
> day3: cleaned up.
> day4: earliest commit to retain based on cleaner configs.
>
> With this table/timeline state, if we remove the savepointed commit, data
> pertaining to day2 will never be cleaned by the cleaner, since it is earlier
> than the earliest commit to retain.
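The timeline described in the issue can be reproduced with a toy model. This is only an illustrative sketch, not Hudi's cleaner code: `StaleFile`, `incremental_clean`, the integer commit times, and the assumption that each run only examines commits between the previous and the new earliest-commit-to-retain are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaleFile:
    """A file version that is eligible for cleaning (hypothetical model)."""
    partition: str
    commit: int  # commit time that wrote this file version


def incremental_clean(files, prev_earliest, new_earliest, savepoints):
    """Toy cleaner: only considers files whose commit lies in
    [prev_earliest, new_earliest) and skips savepointed commits.
    Returns the set of files that are still on storage afterwards."""
    cleaned = {
        f for f in files
        if prev_earliest <= f.commit < new_earliest
        and f.commit not in savepoints
    }
    return files - cleaned


# Stale data written on day1..day3; day4 is the earliest commit to retain.
files = {StaleFile(f"day{i}", i) for i in range(1, 4)}

# Run 1: the commit for day2 is savepointed, so the cleaner skips it.
after_first_run = incremental_clean(files, prev_earliest=0, new_earliest=4,
                                    savepoints={2})

# The savepoint is removed. Run 2 starts from the previous earliest commit
# to retain (4), so day2 (commit 2) falls outside the examined window.
after_second_run = incremental_clean(after_first_run, prev_earliest=4,
                                     new_earliest=5, savepoints=set())

leaked = sorted(f.partition for f in after_second_run)
print(leaked)
```

In this model, the day2 file survives both runs even though nothing protects it any longer, which mirrors the behaviour reported above.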