nodece opened a new pull request, #1502:
URL: https://github.com/apache/pulsar-client-go/pull/1502
## Motivation
The extensible-load-manager CI job timed out after 5 minutes in
TestBlueGreenMigrationTestSuite/TestTopicMigration/proxyConnection.
The stack trace and logs show the test stuck waiting on a WaitGroup while the
producer/consumer goroutines were still looping in their retry paths. During
migration, the producer can enter a terminal state (for example TopicTerminated
or ProducerClosed), but the test's retry loops had no terminal-exit logic, so
they retried effectively without bound until the suite timed out.
## Modifications
- Add terminal error handling to the producer send retry loop:
  - if the error is ErrTopicTerminated or ErrProducerClosed, fail fast instead
    of retrying forever.
- Add bounded retry windows for both producer and consumer loops (30 seconds
per message stage).
- Add an error channel and stage-level wait helper around WaitGroup waits to
fail early on goroutine errors.
- Add stage timeout protection while waiting for:
  - pre-unload send/receive synchronization
  - producer/consumer goroutine completion
- Keep per-iteration context cancellation immediate (cancel after each
send/receive attempt).
These changes make the test deterministic under migration failures and
prevent it from hanging until the global test timeout.