xyueji opened a new pull request, #10680:
URL: https://github.com/apache/seatunnel/pull/10680

   ### Purpose of this pull request
   
   Fix two issues in the StarRocks sink connector that occur when `stream_load` 
fails with "Label has already been used":
   
   **Bug 1 - Data loss in `StarRocksSinkManager`:** When a `stream_load` fails 
and StarRocks retains the label, the current code detects "Label has already 
been used" and `break`s out of the retry loop, **silently skipping the entire 
batch**. This causes data loss because the data was never actually loaded into 
StarRocks.
   
   **Fix:** Instead of `break`, generate a new label and `continue`, so the 
batch is retried under the new label.
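
A minimal sketch of the retry behavior described above, not the actual `StarRocksSinkManager` code. The class `LabelRetrySketch`, the `Loader` interface, `newLabel()`, and `flushWithRetry()` are hypothetical names introduced only for illustration; the real connector's method names and label format may differ.

```java
import java.util.List;
import java.util.UUID;

// Hypothetical stand-in for the sink manager's retry loop (assumed names).
final class LabelRetrySketch {
    static final String LABEL_USED = "Label has already been used";

    /** Stand-in for the stream load call; throws on failure. */
    interface Loader {
        void load(String label, List<String> rows);
    }

    /** Assumed label scheme, for illustration only. */
    static String newLabel() {
        return "sketch-" + UUID.randomUUID();
    }

    /** Returns the label under which the batch was finally loaded. */
    static String flushWithRetry(Loader loader, String label, List<String> rows, int maxRetries) {
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                loader.load(label, rows);
                return label;
            } catch (RuntimeException e) {
                last = e;
                String msg = e.getMessage();
                if (msg != null && msg.contains(LABEL_USED)) {
                    // Before the fix this branch did `break`, which skipped the batch.
                    // After the fix: retry the same rows under a fresh label.
                    label = newLabel();
                    continue;
                }
                // Any other failure is retried with the unchanged label.
            }
        }
        if (last == null) {
            throw new IllegalStateException("no load attempt was made");
        }
        throw last;
    }
}
```

The point of the sketch is that a retained label no longer ends the loop early: the batch either loads under some label or the last failure is rethrown to the caller.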
   
   **Bug 2 - Missing `reCreateLabel` flag in `StarRocksStreamLoadVisitor`:** 
When `doStreamLoad()` returns `Status="Fail"` with a message containing "has 
already been used", the thrown `StarRocksConnectorException` is created without 
`reCreateLabel=true`. The retry logic therefore never takes the 
`needReCreateLabel()` path and never generates a new label.
   
   **Fix:** Detect "has already been used" in the failure message and pass 
`reCreateLabel=true` to the exception constructor.
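
A minimal sketch of the detection described above, assuming a simplified exception type. `ReCreateLabelSketch`, its nested `LoadException`, and `failure()` are hypothetical stand-ins; only the `reCreateLabel` flag and the `needReCreateLabel()` accessor are taken from this description, and the real `StarRocksConnectorException` constructor may take different arguments.

```java
// Hypothetical stand-in for the failure handling in the stream load visitor.
final class ReCreateLabelSketch {
    static final String LABEL_REUSED_MARKER = "has already been used";

    /** Simplified exception carrying the flag that the retry logic checks. */
    static final class LoadException extends RuntimeException {
        private final boolean reCreateLabel;

        LoadException(String message, boolean reCreateLabel) {
            super(message);
            this.reCreateLabel = reCreateLabel;
        }

        boolean needReCreateLabel() {
            return reCreateLabel;
        }
    }

    /** Builds the exception for a response whose Status is "Fail". */
    static LoadException failure(String message) {
        // Set the flag only when the failure message reports a reused label.
        boolean reCreateLabel = message != null && message.contains(LABEL_REUSED_MARKER);
        return new LoadException("Stream load failed: " + message, reCreateLabel);
    }
}
```

With the flag set at the point where the failure is raised, the caller can decide on a new label from `needReCreateLabel()` alone, without parsing the message again.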
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. This is a bug fix for an internal retry mechanism with no API or 
configuration change. The only visible difference is that `stream_load` retries 
now succeed (with a new label) instead of either looping infinitely or silently 
losing data.
   
   ### How was this patch tested?
   
   Manually tested by reproducing the scenario where StarRocks retains a failed 
label. After the fix, the retry generates a new label and completes 
successfully without data loss.
   
   ### Check list
   
   * [x] If your PR adds any new Jar binary package, please add a License Notice according to the [New License Guide](https://github.com/apache/seatunnel/blob/dev/docs/en/contribution/new-license.md)
   * [x] If necessary, please update the documentation to describe the new 
feature. https://github.com/apache/seatunnel/tree/dev/docs
   * [ ] If you are contributing the connector code, please check that the following files are updated:
     1. Update [plugin-mapping.properties](https://github.com/apache/seatunnel/blob/dev/plugin-mapping.properties) and add new connector information in it
     2. Update the pom file of [seatunnel-dist](https://github.com/apache/seatunnel/blob/dev/seatunnel-dist/pom.xml)
     3. Add ci label in [label-scope-conf](https://github.com/apache/seatunnel/blob/dev/.github/workflows/labeler/label-scope-conf.yml)
     4. Add e2e testcase in [seatunnel-e2e](https://github.com/apache/seatunnel/tree/dev/seatunnel-e2e/seatunnel-connector-v2-e2e/)
     5. Update connector [plugin_config](https://github.com/apache/seatunnel/blob/dev/config/plugin_config)
   * [ ] Update the [`release-note`](https://github.com/apache/seatunnel/blob/dev/release-note.md).


