hudi-agent commented on code in PR #19023:
URL: https://github.com/apache/hudi/pull/19023#discussion_r3423875184
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/TransactionUtils.java:
##########
@@ -141,6 +145,28 @@ public static Option<Pair<HoodieInstant, Map<String, String>>> getLastCompletedT
return getHoodieInstantAndMetaDataPair(metaClient, hoodieInstantOption);
}
+ /**
+ * Get the last completed transaction hoodie instant before the given instant time.
+ * The returned instant has both requested time and completion time less than the given instant time,
+ * ensuring it was fully completed before the given instant was created.
+ *
+ * @param metaClient table meta client
+ * @param currentInstantTime the requested time of the current inflight instant
+ * @return the last completed instant before the given instant, with its extra metadata
+ */
+ public static Option<Pair<HoodieInstant, Map<String, String>>> getLastCompletedTxnInstantAndMetadata(
+ HoodieTableMetaClient metaClient, String currentInstantTime) {
+ Option<HoodieInstant> hoodieInstantOption = Option.fromJavaOptional(
+ metaClient.getActiveTimeline().getCommitsTimeline()
+ .filterCompletedInstants()
+ .findInstantsBefore(currentInstantTime)
+ .getInstantsAsStream()
+ .filter(instant -> instant.getCompletionTime() != null
+ && compareTimestamps(instant.getCompletionTime(), LESSER_THAN, currentInstantTime))
Review Comment:
🤖 The V8+ `SimpleConcurrentFileWritesConflictResolutionStrategy` uses
`findInstantsAfter(lastSuccessful.requestedTime())` as its candidate-set
cutoff, so picking max-by-completion-time can still leave already-completed
instants in the candidate set. E.g. slow streaming commit A (req=T1, comp=T5)
and fast clustering B (req=T2>T1, comp=T3<T5) both predate `currentInstantTime`
— this picks A, and `findInstantsAfter(T1)` still includes B, potentially
re-triggering the same false-conflict the PR is trying to fix. Would
`Comparator.comparing(HoodieInstant::requestedTime)` be safer here (also
matching the existing single-arg `lastInstant()` semantics)?
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
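
The A/B scenario above can be reproduced with a small standalone sketch. It does not use any Hudi classes: `Instant`, `pickLast`, and `candidatesAfter` are hypothetical stand-ins, timestamps are plain integers (T1..T5 become 1..5, with the current instant requested at 6), and `candidatesAfter` only approximates the `findInstantsAfter(requestedTime)` cutoff as described in this comment.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class CutoffSketch {
  // Hypothetical stand-in for HoodieInstant: a name plus integer timestamps.
  static final class Instant {
    final String name;
    final int requestedTime;
    final int completionTime;

    Instant(String name, int requestedTime, int completionTime) {
      this.name = name;
      this.requestedTime = requestedTime;
      this.completionTime = completionTime;
    }
  }

  // Keep instants requested and completed before `current`, then take the max under `order`.
  static Optional<Instant> pickLast(List<Instant> timeline, int current, Comparator<Instant> order) {
    return timeline.stream()
        .filter(i -> i.requestedTime < current && i.completionTime < current)
        .max(order);
  }

  // Approximates a requested-time cutoff: names of instants requested strictly after `cutoff`.
  static List<String> candidatesAfter(List<Instant> timeline, Instant cutoff) {
    return timeline.stream()
        .filter(i -> i.requestedTime > cutoff.requestedTime)
        .map(i -> i.name)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // A: slow streaming commit (req=1, comp=5); B: fast clustering (req=2, comp=3).
    List<Instant> timeline = List.of(new Instant("A", 1, 5), new Instant("B", 2, 3));
    int current = 6;

    Instant byCompletion =
        pickLast(timeline, current, Comparator.comparingInt(i -> i.completionTime)).orElseThrow();
    Instant byRequested =
        pickLast(timeline, current, Comparator.comparingInt(i -> i.requestedTime)).orElseThrow();

    // prints "max by completion time: cutoff=A, candidates=[B]"
    System.out.println("max by completion time: cutoff=" + byCompletion.name
        + ", candidates=" + candidatesAfter(timeline, byCompletion));
    // prints "max by requested time: cutoff=B, candidates=[]"
    System.out.println("max by requested time: cutoff=" + byRequested.name
        + ", candidates=" + candidatesAfter(timeline, byRequested));
  }
}
```

Under these assumptions, ordering by completion time selects A and leaves the already-completed B in the candidate set, while ordering by requested time selects B and leaves the set empty. Whether the real strategy behaves the same way should be checked against the actual timeline code.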