Copilot commented on code in PR #364:
URL:
https://github.com/apache/doris-spark-connector/pull/364#discussion_r3464115591
##########
spark-doris-connector/spark-doris-connector-base/src/main/java/org/apache/doris/spark/client/write/AbstractStreamLoadProcessor.java:
##########
@@ -177,11 +181,15 @@ public String stop() throws Exception {
throw new StreamLoadException("response is null");
}
} catch (Exception e) {
+ // Batch failed: keep the current label so the retry reuses it
(dedup-safe).
+ labelManager.onBatchFailed();
if (unexpectedException != null) {
Review Comment:
`labelManager.onBatchFailed()` is executed for *all* failures of
`requestFuture.get()`. If the failure was a fresh-label collision (`Status:
Label Already Exists`) rather than an unknown-commit retry scenario, this will
incorrectly force the next attempt to reuse the colliding label. On the
subsequent retry, `isAlreadyCommitted(labelReused=true, Status=Label Already
Exists, ExistingJobStatus=FINISHED)` will be treated as success, which can
silently drop the batch’s rows (the FINISHED job may belong to a different
load).
Consider distinguishing "unknown commit outcome" failures (e.g.,
IO/interrupt/lost response) from definite fresh-label collisions. For
fresh-label collisions, the retry should mint a new label (do not enable
reuse), and `Label Already Exists` must remain an error.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]