yihua commented on code in PR #9473:
URL: https://github.com/apache/hudi/pull/9473#discussion_r1320357906


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/IncrSourceHelper.java:
##########
@@ -130,7 +130,7 @@ public static QueryInfo generateQueryInfo(JavaSparkContext 
jssc, String srcBaseP
       }
     });
 

Review Comment:
   ```suggestion
   
   // When `beginInstantTime` is `DEFAULT_BEGIN_TIMESTAMP` (due to missing 
checkpoint), `previousInstantTime` is set to `DEFAULT_BEGIN_TIMESTAMP` as well.
   ```



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/IncrSourceHelper.java:
##########
@@ -130,7 +130,7 @@ public static QueryInfo generateQueryInfo(JavaSparkContext 
jssc, String srcBaseP
       }
     });
 
-    String previousInstantTime = beginInstantTime;
+    String previousInstantTime = DEFAULT_BEGIN_TIMESTAMP;
     if (!beginInstantTime.equals(DEFAULT_BEGIN_TIMESTAMP)) {

Review Comment:
   ```suggestion
   // When `beginInstantTime` is present, `previousInstantTime` is set to the 
completed commit before `beginInstantTime` if that exists.  If there is no 
completed commit before `beginInstantTime`, e.g., `beginInstantTime` is the 
first commit in the active timeline, `previousInstantTime` is set to 
`DEFAULT_BEGIN_TIMESTAMP`.
       if (!beginInstantTime.equals(DEFAULT_BEGIN_TIMESTAMP)) {
   ```



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/IncrSourceHelper.java:
##########
@@ -130,7 +130,7 @@ public static QueryInfo generateQueryInfo(JavaSparkContext 
jssc, String srcBaseP
       }
     });
 
-    String previousInstantTime = beginInstantTime;
+    String previousInstantTime = DEFAULT_BEGIN_TIMESTAMP;

Review Comment:
   Have you also gone through the logic for 
`MissingCheckpointStrategy.READ_LATEST` and see if there's any gap?



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/IncrSourceHelper.java:
##########
@@ -130,7 +130,7 @@ public static QueryInfo generateQueryInfo(JavaSparkContext 
jssc, String srcBaseP
       }
     });
 
-    String previousInstantTime = beginInstantTime;
+    String previousInstantTime = DEFAULT_BEGIN_TIMESTAMP;

Review Comment:
   Basically, the only change here is when `beginInstantTime === <first 
commit>`, the `previousInstantTime` is `DEFAULT_BEGIN_TIMESTAMP` now whereas 
`<first commit>` before this PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to