nsivabalan commented on code in PR #5478:
URL: https://github.com/apache/hudi/pull/5478#discussion_r964245759


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanActionExecutor.java:
##########
@@ -64,11 +65,16 @@ private int getCommitsSinceLastCleaning() {
     Option<HoodieInstant> lastCleanInstant = table.getActiveTimeline().getCleanerTimeline().filterCompletedInstants().lastInstant();
     HoodieTimeline commitTimeline = table.getActiveTimeline().getCommitsTimeline().filterCompletedInstants();
 
-    String latestCleanTs;
-    int numCommits = 0;
-    if (lastCleanInstant.isPresent()) {
-      latestCleanTs = lastCleanInstant.get().getTimestamp();
-      numCommits = commitTimeline.findInstantsAfter(latestCleanTs).countInstants();
+    int numCommits;
+    if (lastCleanInstant.isPresent() && !table.getActiveTimeline().isEmpty(lastCleanInstant.get())) {
+      try {
+        HoodieCleanMetadata cleanMetadata = TimelineMetadataUtils
+            .deserializeHoodieCleanMetadata(table.getActiveTimeline().getInstantDetails(lastCleanInstant.get()).get());
+        String lastCompletedCommitTimestamp = cleanMetadata.getLastCompletedCommitTimestamp();
+        numCommits = commitTimeline.findInstantsAfter(lastCompletedCommitTimestamp).countInstants();
+      } catch (IOException e) {
+        throw new HoodieIOException(e.getMessage(), e);

Review Comment:
   The exception passed as the second argument already carries the message. Can we
   make the first argument a custom message instead, e.g.
   "Parsing of last clean instant " + lastCleanInstant.get() + " failed"?
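   A minimal, self-contained sketch of the suggested pattern. `WrappedIOException`
   and `parseLastCompletedCommit` are hypothetical stand-ins (for `HoodieIOException`
   and the clean-metadata deserialization) so the snippet compiles without Hudi on
   the classpath; only the message format comes from the comment above.

```java
import java.io.IOException;

class CleanInstantParsing {
  /** Hypothetical stand-in for HoodieIOException: an unchecked wrapper around IOException. */
  static class WrappedIOException extends RuntimeException {
    WrappedIOException(String message, IOException cause) {
      super(message, cause);
    }
  }

  /** Hypothetical parser that always fails, used only to exercise the error path. */
  static String parseLastCompletedCommit(byte[] details) throws IOException {
    throw new IOException("unexpected end of stream");
  }

  /** Wraps the low-level failure with a message naming the instant that could not be parsed. */
  static String lastCompletedCommitOf(String instant, byte[] details) {
    try {
      return parseLastCompletedCommit(details);
    } catch (IOException e) {
      // The first argument adds context; the original message stays reachable via getCause().
      throw new WrappedIOException("Parsing of last clean instant " + instant + " failed", e);
    }
  }
}
```

   With this shape the top-level message says which instant failed, while the
   underlying I/O error is still available through the cause chain.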



##########
hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java:
##########
@@ -502,14 +502,20 @@ public void handle(@NotNull Context context) throws Exception {
         if (refreshCheck) {
           long beginFinalCheck = System.currentTimeMillis();
           if (isLocalViewBehind(context)) {
-            String errMsg =
-                "Last known instant from client was "
-                    + context.queryParam(RemoteHoodieTableFileSystemView.LAST_INSTANT_TS,
-                        HoodieTimeline.INVALID_INSTANT_TS)
-                    + " but server has the following timeline "
-                    + viewManager.getFileSystemView(context.queryParam(RemoteHoodieTableFileSystemView.BASEPATH_PARAM))
-                        .getTimeline().getInstants().collect(Collectors.toList());
-            throw new BadRequestResponse(errMsg);
+            String lastInstantTs = context.queryParam(RemoteHoodieTableFileSystemView.LAST_INSTANT_TS,
+                HoodieTimeline.INVALID_INSTANT_TS);
+            HoodieTimeline localTimeline =
+                viewManager.getFileSystemView(context.queryParam(RemoteHoodieTableFileSystemView.BASEPATH_PARAM)).getTimeline();
+            HoodieTimeline afterLastInstantTimeLine = localTimeline.findInstantsAfter(lastInstantTs).filterCompletedInstants();
+            if (!(afterLastInstantTimeLine.countInstants() == 1

Review Comment:
   So are we making an exception for just one case here? That is, when the client
   has one extra commit compared to the timeline server and that commit is a clean
   action, we don't trigger a refresh?
   Can you move this check into a separate method? I can see us adding more cases
   here going forward.
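   One possible shape for the extracted check, as a sketch only. It follows the
   exemption as read in the comment above (exactly one extra completed instant, and
   it is a clean), since the full condition is not visible in the hunk. `Instant`,
   `isOnlyCleanAhead`, and `canSkipRefresh` are hypothetical names, and the
   `"clean"` action string is assumed rather than taken from the diff.

```java
import java.util.List;

class RefreshCheck {
  /** Hypothetical, simplified stand-in for a timeline instant. */
  static final class Instant {
    final String timestamp;
    final String action;

    Instant(String timestamp, String action) {
      this.timestamp = timestamp;
      this.action = action;
    }
  }

  // Assumed action name for clean instants.
  static final String CLEAN_ACTION = "clean";

  /** True when the only completed instant after the last known one is a clean. */
  static boolean isOnlyCleanAhead(List<Instant> completedAfterLast) {
    return completedAfterLast.size() == 1
        && CLEAN_ACTION.equals(completedAfterLast.get(0).action);
  }

  /** Single place to collect exemptions; further cases can be OR-ed in here. */
  static boolean canSkipRefresh(List<Instant> completedAfterLast) {
    return isOnlyCleanAhead(completedAfterLast);
  }
}
```

   The handler would then test `!canSkipRefresh(...)` before rejecting the request,
   and each new exemption becomes one more small predicate instead of a longer
   inline condition.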
   
   



##########
hudi-common/src/main/avro/HoodieCleanerPlan.avsc:
##########
@@ -42,6 +42,11 @@
       }],
       "default" : null
     },
+    {
+      "name": "lastCompletedCommitTimestamp",
+      "type": "string",
+      "default" : ""

Review Comment:
   Same here: why an empty string as the default rather than null?



##########
hudi-common/src/main/avro/HoodieCleanMetadata.avsc:
##########
@@ -23,6 +23,7 @@
      {"name": "timeTakenInMillis", "type": "long"},
      {"name": "totalFilesDeleted", "type": "int"},
      {"name": "earliestCommitToRetain", "type": "string"},
+     {"name": "lastCompletedCommitTimestamp", "type": "string", "default" : ""},

Review Comment:
   Why an empty string? Could we go with null instead?


