oneby-wang opened a new pull request, #25889:
URL: https://github.com/apache/pulsar/pull/25889
### Motivation
`PulsarFunctionTlsTest.testFunctionsCreation` is flaky when the two function
workers are in a leadership switchover window. The test can observe a
worker/leader state that is already stale by the time the create-function
request is internally forwarded to `/admin/v3/functions/leader/...`.
In that window, the coordination topic can already point to the new leader
while the new leader is still finishing initialization. The forwarded request
then fails with a transient `HTTP 503` response:
`Leader not yet ready. Please retry again`.
Reproduction log snippet:
```text
2026-05-29T22:21:35,931 - INFO - [assignment-tailer-thread:FunctionAssignmentTailer] - assignment tailer thread exiting {}
2026-05-29T22:21:35,931 - INFO - [pulsar-external-listener-5215-1:FunctionAssignmentTailer] - Closing function assignment tailer {}
2026-05-29T22:21:35,932 - INFO - [pulsar-external-listener-5215-1:FunctionMetaDataManager] - FunctionMetaDataManager becoming leader by creating exclusive producer {}
...
2026-05-29T22:21:36,011 - INFO - [pulsar-web-5184-20:JettyRequestLogFactory] - HTTP request {bytesOut=53, clientAddr=127.0.0.1, clientPort=52607, durationMs=3, method=PUT, proto=HTTP/1.1, referer=null, status=503, uri=https://localhost:52588/admin/v3/functions/leader/my-tenant/my-ns/function-0, user=null, userAgent=Pulsar-Java-v5.0.0-M1-SNAPSHOT}
2026-05-29T22:21:36,012 - ERROR - [pulsar-web-5006-16:ComponentImpl] - Failed to update function on leader {error=Update Failed}
org.apache.pulsar.client.admin.PulsarAdminException$ServerSideErrorException: Leader not yet ready. Please retry again
    at org.apache.pulsar.client.admin.PulsarAdminException.wrap(PulsarAdminException.java:252)
    at org.apache.pulsar.client.admin.internal.BaseResource.sync(BaseResource.java:366)
    at org.apache.pulsar.client.admin.internal.FunctionsImpl.updateOnWorkerLeader(FunctionsImpl.java:706)
```
### Modifications
- Remove the test-side pre-check of the worker leader state before function
creation.
- Retry `createFunctionWithUrl` only when the failure is a
`PulsarAdminException` with status code `503` and `Leader not yet ready` in the
HTTP error body.
- Keep all other admin failures visible immediately so TLS, auth,
validation, or non-transient service errors are not hidden.
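The retry rule in the list above can be sketched roughly as follows. This is an illustration only, not the code from the PR: `AdminException`, `AdminCall`, `isLeaderNotReady`, and `callWithRetry` are hypothetical names, and `AdminException` is a stand-in for `PulsarAdminException` so that the snippet compiles without Pulsar on the classpath. The attempt limit and backoff are likewise assumptions.

```java
final class LeaderNotReadyRetry {

    /** Hypothetical stand-in for PulsarAdminException: status code plus HTTP error body. */
    static class AdminException extends Exception {
        private final int statusCode;

        AdminException(int statusCode, String httpError) {
            super(httpError);
            this.statusCode = statusCode;
        }

        int getStatusCode() {
            return statusCode;
        }
    }

    /** Hypothetical wrapper for an admin call such as createFunctionWithUrl. */
    @FunctionalInterface
    interface AdminCall {
        void run() throws AdminException;
    }

    /** True only for the transient switchover failure: 503 with the leader-not-ready text. */
    static boolean isLeaderNotReady(AdminException e) {
        return e.getStatusCode() == 503
                && e.getMessage() != null
                && e.getMessage().contains("Leader not yet ready");
    }

    /**
     * Runs the call, retrying only on the transient leader-not-ready failure.
     * Any other failure, or exhausting maxAttempts, is rethrown unchanged.
     * Returns the number of attempts that were needed.
     */
    static int callWithRetry(AdminCall call, int maxAttempts, long backoffMillis)
            throws AdminException, InterruptedException {
        for (int attempt = 1; ; attempt++) {
            try {
                call.run();
                return attempt;
            } catch (AdminException e) {
                if (!isLeaderNotReady(e) || attempt >= maxAttempts) {
                    throw e; // keep TLS, auth, and validation errors visible
                }
                Thread.sleep(backoffMillis);
            }
        }
    }
}
```

Matching on both the status code and the error text keeps the retry narrow: a `503` raised for any other reason still fails the test on the first attempt.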
### Verifying this change
This change is already covered by existing tests:
- `./gradlew :pulsar-broker:test --tests org.apache.pulsar.functions.worker.PulsarFunctionTlsTest.testFunctionsCreation`
### Does this pull request potentially affect one of the following parts:
- [ ] Dependencies (add or upgrade a dependency)
- [ ] The public API
- [ ] The schema
- [ ] The default values of configurations
- [ ] The threading model
- [ ] The binary protocol
- [ ] The REST endpoints
- [ ] The admin CLI options
- [ ] The metrics
- [ ] Anything that affects deployment