Copilot commented on code in PR #364:
URL: 
https://github.com/apache/doris-spark-connector/pull/364#discussion_r3464115591


##########
spark-doris-connector/spark-doris-connector-base/src/main/java/org/apache/doris/spark/client/write/AbstractStreamLoadProcessor.java:
##########
@@ -177,11 +181,15 @@ public String stop() throws Exception {
                     throw new StreamLoadException("response is null");
                 }
             } catch (Exception e) {
+                // Batch failed: keep the current label so the retry reuses it 
(dedup-safe).
+                labelManager.onBatchFailed();
                 if (unexpectedException != null) {

Review Comment:
   `labelManager.onBatchFailed()` is executed for *all* failures of 
`requestFuture.get()`. If the failure was a fresh-label collision (`Status: 
Label Already Exists`) rather than an unknown-commit retry scenario, this will 
incorrectly force the next attempt to reuse the colliding label. On the 
subsequent retry, `isAlreadyCommitted(labelReused=true, Status=Label Already 
Exists, ExistingJobStatus=FINISHED)` will be treated as success, which can 
silently drop the batch’s rows (the FINISHED job may belong to a different 
load).
   
   Consider distinguishing "unknown commit outcome" failures (e.g., 
IO/interrupt/lost response) from definite fresh-label collisions. For 
fresh-label collisions, the retry should mint a new label (do not enable 
reuse), and `Label Already Exists` must remain an error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to