nsivabalan commented on code in PR #17943:
URL: https://github.com/apache/hudi/pull/17943#discussion_r2848970127
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -154,12 +156,37 @@ public List<String> getPartitionPathsToClean(Option<HoodieInstant> earliestRetai
case KEEP_LATEST_BY_HOURS:
return getPartitionPathsForCleanByCommits(earliestRetainedInstant);
case KEEP_LATEST_FILE_VERSIONS:
+ if (canCleanBeSkipped()) {
Review Comment:
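Why only `KEEP_LATEST_FILE_VERSIONS`?
For out-of-the-box (OOB) users, the metadata table (MDT) cleaner config is derived from the data table (DT) cleaner configs, right?
And the OOB default is to clean based on the number of commits, so those users would never reach this branch.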
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -154,12 +156,37 @@ public List<String> getPartitionPathsToClean(Option<HoodieInstant> earliestRetai
case KEEP_LATEST_BY_HOURS:
return getPartitionPathsForCleanByCommits(earliestRetainedInstant);
case KEEP_LATEST_FILE_VERSIONS:
+ if (canCleanBeSkipped()) {
+ return Collections.emptyList();
+ }
return getPartitionPathsForFullCleaning();
default:
throw new IllegalStateException("Unknown Cleaner Policy");
}
}
+ /**
+ * Returns true if clean can be skipped for MOR metadata tables when only delta commits occurred after the last clean.
+ */
+ private boolean canCleanBeSkipped() {
Review Comment:
Minor: consider renaming this to `canSkipClean()`.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -154,12 +156,37 @@ public List<String> getPartitionPathsToClean(Option<HoodieInstant> earliestRetai
case KEEP_LATEST_BY_HOURS:
return getPartitionPathsForCleanByCommits(earliestRetainedInstant);
case KEEP_LATEST_FILE_VERSIONS:
+ if (canCleanBeSkipped()) {
+ return Collections.emptyList();
+ }
return getPartitionPathsForFullCleaning();
default:
throw new IllegalStateException("Unknown Cleaner Policy");
}
}
+ /**
+ * Returns true if clean can be skipped for MOR metadata tables when only delta commits occurred after the last clean.
Review Comment:
Please fix the documentation so that it explicitly calls out the metadata table.
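To make the intent concrete, here is a rough, self-contained sketch of the kind of check and Javadoc wording I have in mind. It does not use Hudi's real classes: the `Action` enum, the boolean flags, and the `actionsSinceLastClean` parameter are hypothetical stand-ins for whatever the timeline and table config expose in the PR. It also assumes a previous clean exists, and treats "nothing happened since the last clean" as skippable.

```java
import java.util.Arrays;
import java.util.List;

class CleanSkipSketch {

  // Hypothetical stand-in for timeline action types; NOT Hudi's actual API.
  enum Action { COMMIT, DELTA_COMMIT, COMPACTION, CLEAN }

  /**
   * Returns true if cleaning a merge-on-read (MOR) metadata table can be skipped
   * because only delta commits completed after the last clean.
   * Always returns false for data tables and for non-MOR tables.
   */
  static boolean canSkipClean(boolean isMetadataTable,
                              boolean isMergeOnRead,
                              List<Action> actionsSinceLastClean) {
    // The skip applies only to MOR metadata tables (assumption of this sketch).
    if (!isMetadataTable || !isMergeOnRead) {
      return false;
    }
    // Any action other than a delta commit forces a regular clean.
    // An empty list (no actions since the last clean) also yields true.
    return actionsSinceLastClean.stream().allMatch(a -> a == Action.DELTA_COMMIT);
  }

  public static void main(String[] args) {
    List<Action> onlyDeltas = Arrays.asList(Action.DELTA_COMMIT, Action.DELTA_COMMIT);
    List<Action> withCompaction = Arrays.asList(Action.DELTA_COMMIT, Action.COMPACTION);
    System.out.println(canSkipClean(true, true, onlyDeltas));
    System.out.println(canSkipClean(true, true, withCompaction));
    System.out.println(canSkipClean(false, true, onlyDeltas));
  }
}
```

Putting the metadata-table guard first keeps the Javadoc and the behavior aligned: a reader sees immediately that data tables never take the skip path.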