RongtongJin opened a new issue, #10494: URL: https://github.com/apache/rocketmq/issues/10494
### Problem

`HATest.testSemiSyncReplica` can be flaky with:

```text
expected:<PUT_OK> but was:<FLUSH_SLAVE_TIMEOUT>
```

The test setup waits until the slave-side HA client enters `TRANSFER`, then immediately starts semi-sync writes. Entering `TRANSFER` only proves the slave has connected locally. The master-side `HAConnection` may not have received the slave's initial offset report yet, leaving `slaveAckOffset` at `-1` during the first synchronous replication request.

### Impact

On slower or busy CI machines, the first `asyncPutMessage` can race the initial slave ack report and time out even though the HA connection is otherwise healthy.

### Proposed fix

Make the test wait for the actual readiness condition needed by semi-sync replication: the master-side HA connection is in `TRANSFER` and its `slaveAckOffset` has caught up to the slave's current max physical offset before sending messages.

### Validation

Ran locally with Maven 3.9.9:

```bash
mvn -pl store -am -Dtest=HATest#testSemiSyncReplica -DskipITs -DfailIfNoTests=false test
mvn -pl store -am -Dtest=HATest -DskipITs -DfailIfNoTests=false test
```

The full `HATest` run reported `Tests run: 4, Failures: 0, Errors: 0, Skipped: 1`.
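
For illustration only, the readiness condition described under "Proposed fix" could be sketched as a small polling helper. This is not the actual patch: the class `SemiSyncReadiness` and its methods `isSemiSyncReady` and `awaitCondition` are hypothetical names, and the three inputs (whether the master-side connection is in `TRANSFER`, its `slaveAckOffset`, and the slave's max physical offset) are assumed to be read from the stores by the test. The sketch does not call any RocketMQ API.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of RocketMQ: models the wait the issue proposes.
final class SemiSyncReadiness {

    // Ready only when the master-side connection is transferring AND the
    // acked offset has reached the slave's max physical offset.
    // A slaveAckOffset of -1 (no report received yet) is never ready.
    public static boolean isSemiSyncReady(boolean masterInTransfer,
                                          long slaveAckOffset,
                                          long slaveMaxPhyOffset) {
        return masterInTransfer
            && slaveAckOffset >= 0
            && slaveAckOffset >= slaveMaxPhyOffset;
    }

    // Polls the condition until it holds or the timeout elapses.
    // Returns false on timeout or if the waiting thread is interrupted.
    public static boolean awaitCondition(BooleanSupplier condition,
                                         long timeoutMillis,
                                         long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated master-side state: the ack starts at -1 and the first
        // slave offset report arrives a little later on another thread.
        AtomicLong slaveAckOffset = new AtomicLong(-1L);
        long slaveMaxPhyOffset = 0L;

        Thread report = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            slaveAckOffset.set(0L);
        });
        report.start();

        boolean ready = awaitCondition(
            () -> isSemiSyncReady(true, slaveAckOffset.get(), slaveMaxPhyOffset),
            2000, 10);
        System.out.println("ready=" + ready);
    }
}
```

In a real test, the supplier passed to `awaitCondition` would read the live values from the master and slave stores, and the semi-sync writes would start only after it returns `true`.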
