pdolif opened a new pull request, #24854: URL: https://github.com/apache/pulsar/pull/24854
Fixes #24837 ### Motivation The test is flaky. It verifies that lookup requests use the lookup properties configured at the time when the request is triggered (and not at the time it is executed). Therefore, it triggers 10 broker lookups in sequence and changes the lookup properties before each. The property "broker.id" is set to "key-0", "key-1",..., "key-9" respectively. To verify that the correct properties are used, the test uses the `BrokerIdAwareLoadManager`, which extends `ExtensibleLoadManagerImpl`. It stores a `CLIENT_ID_LIST`. When `assign()` is called, the "broker.id" property is added to the `CLIENT_ID_LIST`. The test waits until all lookup requests finish, and then asserts that `CLIENT_ID_LIST` is equal to "key-0", "key-1",..., "key-9". The assertion fails if the list contains the values in a different order. When running the tests, I noticed that the order is either "key-0", "key-1",..., "key-9" (ascending) or the reverse order. The reason seems to be related to CompletableFuture "callbacks" like `thenCompose()` and their execution order. For example, for each of the 10 lookup requests, `NamespaceService.getBrokerServiceUrlAsync()` is called (in sequence), which contains the following code: https://github.com/apache/pulsar/blob/461ffd84d641b855fcc0d09d586ff4b580c21974/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java#L225-L226 When `thenCompose()` is called, it can either be that the `getBundleAsync(topic)` CompletableFuture is completed or not. If it is completed, `thenCompose()` can be executed immediately for each of the 10 lookup requests. The test passes. If it is not completed, `thenCompose()` is called 10 times but not executed immediately. It seems like the "callbacks" are stacked. Once the CompletableFuture completes, the 10 "callbacks" are executed in LIFO order, i.e., the one for "key-9" is executed first. Therefore, the order in `CLIENT_ID_LIST` will be reversed, and the test fails. ### Modifications Make the assertion less strict and ignore the value order. ### Verifying this change - [x] Make sure that the change passes the CI checks. ### Does this pull request potentially affect one of the following parts: *If the box was checked, please highlight the changes* - [ ] Dependencies (add or upgrade a dependency) - [ ] The public API - [ ] The schema - [ ] The default values of configurations - [ ] The threading model - [ ] The binary protocol - [ ] The REST endpoints - [ ] The admin CLI options - [ ] The metrics - [ ] Anything that affects deployment ### Documentation <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. --> - [ ] `doc` <!-- Your PR contains doc changes. --> - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later --> - [x] `doc-not-needed` <!-- Your PR changes do not impact docs --> - [ ] `doc-complete` <!-- Docs have been already added --> ### Matching PR in forked repository PR in forked repository: https://github.com/pdolif/pulsar/pull/17 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
