hudi-agent commented on code in PR #19023:
URL: https://github.com/apache/hudi/pull/19023#discussion_r3423875184


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/TransactionUtils.java:
##########
@@ -141,6 +145,28 @@ public static Option<Pair<HoodieInstant, Map<String, String>>> getLastCompletedT
     return getHoodieInstantAndMetaDataPair(metaClient, hoodieInstantOption);
   }
 
+  /**
+   * Get the last completed transaction hoodie instant before the given instant time.
+   * The returned instant has both requested time and completion time less than the given instant time,
+   * ensuring it was fully completed before the given instant was created.
+   *
+   * @param metaClient table meta client
+   * @param currentInstantTime the requested time of the current inflight instant
+   * @return the last completed instant before the given instant, with its extra metadata
+   */
+  public static Option<Pair<HoodieInstant, Map<String, String>>> getLastCompletedTxnInstantAndMetadata(
+      HoodieTableMetaClient metaClient, String currentInstantTime) {
+    Option<HoodieInstant> hoodieInstantOption = Option.fromJavaOptional(
+        metaClient.getActiveTimeline().getCommitsTimeline()
+            .filterCompletedInstants()
+            .findInstantsBefore(currentInstantTime)
+            .getInstantsAsStream()
+            .filter(instant -> instant.getCompletionTime() != null
+                && compareTimestamps(instant.getCompletionTime(), LESSER_THAN, currentInstantTime))

Review Comment:
   🤖 The V8+ `SimpleConcurrentFileWritesConflictResolutionStrategy` uses `findInstantsAfter(lastSuccessful.requestedTime())` as its candidate-set cutoff, so picking the instant with the maximum completion time can still leave already-completed instants in the candidate set. For example, suppose a slow streaming commit A (req=T1, comp=T5) and a fast clustering commit B (req=T2>T1, comp=T3<T5) both complete before `currentInstantTime`. This code picks A, and `findInstantsAfter(T1)` still includes B, potentially re-triggering the same false conflict the PR is trying to fix. Would `Comparator.comparing(HoodieInstant::requestedTime)` be safer here (also matching the `lastInstant()` semantics of the existing single-arg overload)?
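
To make the scenario concrete, here is a minimal standalone sketch. It does not use any Hudi classes: `Commit`, `lastCompleted`, and `candidatesAfter` are hypothetical stand-ins, timestamps are plain integers (T1=1, T2=2, and so on, with `currentInstantTime`=6 chosen arbitrarily), and `candidatesAfter` only mimics the requested-time cutoff described above, not the real conflict resolution strategy.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class CutoffSketch {

  // Hypothetical stand-in for HoodieInstant: a name plus integer timestamps.
  static final class Commit {
    final String name;
    final int requestedTime;
    final int completionTime;

    Commit(String name, int requestedTime, int completionTime) {
      this.name = name;
      this.requestedTime = requestedTime;
      this.completionTime = completionTime;
    }
  }

  // Keep commits that were both requested and completed before `current`,
  // then pick the "last" one according to the supplied ordering.
  static Optional<Commit> lastCompleted(List<Commit> timeline, int current,
                                        Comparator<Commit> order) {
    return timeline.stream()
        .filter(c -> c.requestedTime < current && c.completionTime < current)
        .max(order);
  }

  // Assumed model of the cutoff: every commit requested strictly after the
  // cutoff's requested time stays in the candidate set.
  static List<String> candidatesAfter(List<Commit> timeline, int cutoffRequestedTime) {
    return timeline.stream()
        .filter(c -> c.requestedTime > cutoffRequestedTime)
        .map(c -> c.name)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // A: slow streaming commit (req=1, comp=5); B: fast clustering (req=2, comp=3).
    List<Commit> timeline = List.of(new Commit("A", 1, 5), new Commit("B", 2, 3));
    int current = 6;

    Commit byCompletion = lastCompleted(timeline, current,
        Comparator.comparingInt((Commit c) -> c.completionTime)).get();
    Commit byRequested = lastCompleted(timeline, current,
        Comparator.comparingInt((Commit c) -> c.requestedTime)).get();

    // prints "A -> [B]": B is already complete but still a candidate
    System.out.println(byCompletion.name + " -> "
        + candidatesAfter(timeline, byCompletion.requestedTime));
    // prints "B -> []": no already-completed commit remains
    System.out.println(byRequested.name + " -> "
        + candidatesAfter(timeline, byRequested.requestedTime));
  }
}
```

Under these assumptions, ordering by completion time selects A and leaves B in the candidate set, while ordering by requested time selects B and leaves the set empty.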
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
