TakaHiR07 opened a new pull request, #23551: URL: https://github.com/apache/pulsar/pull/23551
Fixes https://github.com/apache/pulsar/issues/23550

### Motivation

After digging into the code, I found a concurrency bug in `TransactionBufferHandlerImpl#checkRequestCredits()` and `checkPendingRequests()` that causes the issue above.

Currently, the config `TransactionBufferClientMaxConcurrentRequests` controls the number of concurrent requests. However, if a request and a response are interleaved as follows, the request stays stuck in the queue permanently. (To simplify the case, assume the number of permits is 1.)

| step | request-1 | request-2 | response-1 |
| ----------- | ----------- | ----------- | ----------- |
| 1 | start checkRequestCredits() | | |
| 2 | compareAndSet requestCredits to 0 | | |
| 3 | execute endTxn | | |
| 4 | | start checkRequestCredits() | |
| 5 | | get currentPermits = 0 | |
| 6 | | | trigger onResponse(), set requestCredits to 1 |
| 7 | | | trigger checkPendingRequests(), permit == 1 && pendingRequests is empty, so break out of the while loop |
| 8 | | currentPermits == 0 && pendingRequests is empty, so add op to pendingRequests | |

At this point no response is left to trigger `pendingRequests.remove`, so every new request is just added to `pendingRequests` and never executed.

### Modifications

The root cause is that currently only `onResponse()` can trigger `pendingRequests.remove`. But when `onResponse()` runs, the requestOp may not have been added to `pendingRequests` yet.

- So one modification is to also check the `pendingRequests` queue in `checkRequestCredits()`.
- The `while(true)` in `checkPendingRequests()` is not necessary: when one response comes back, taking one requestOp from `pendingRequests` is enough.

It is hard to add a test for this concurrent case.

### Verifying this change

- [ ] Make sure that the change passes the CI checks.

### Does this pull request potentially affect one of the following parts:

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

*If the box was checked, please highlight the changes*

- [ ] Dependencies (add or upgrade a dependency)
- [ ] The public API
- [ ] The schema
- [ ] The default values of configurations
- [ ] The threading model
- [ ] The binary protocol
- [ ] The REST endpoints
- [ ] The admin CLI options
- [ ] The metrics
- [ ] Anything that affects deployment

### Documentation

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

### Matching PR in forked repository

PR in forked repository: <!-- ENTER URL HERE -->

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
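For readers who want the idea under "Modifications" in concrete form, here is a minimal standalone sketch. It is not the Pulsar code or the actual diff of this PR: `CreditGate`, `submit`, `drainPending` and `pendingCount` are hypothetical names chosen for illustration. The point it tries to show is that the request path re-checks the queue after enqueueing, so a credit released between the failed acquire and the enqueue is still picked up. The drain loop here is bounded by the available credits and exists only to handle another thread emptying the queue first; that is a choice of this sketch, not of the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the credit handling described above; not Pulsar code.
public class CreditGate {
    private final AtomicInteger credits;
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();

    public CreditGate(int maxConcurrentRequests) {
        this.credits = new AtomicInteger(maxConcurrentRequests);
    }

    // Request path: send immediately if a credit is free, otherwise enqueue.
    public void submit(Runnable op) {
        if (tryAcquire()) {
            op.run();
            return;
        }
        pending.add(op);
        // Re-check after enqueueing: a response may have released a credit
        // after tryAcquire() failed but before the op reached the queue.
        drainPending();
    }

    // Response path: give the credit back, then try to send a queued op.
    public void onResponse() {
        credits.incrementAndGet();
        drainPending();
    }

    public int pendingCount() {
        return pending.size();
    }

    // Take one credit with a CAS loop; returns false when none is left.
    private boolean tryAcquire() {
        while (true) {
            int current = credits.get();
            if (current <= 0) {
                return false;
            }
            if (credits.compareAndSet(current, current - 1)) {
                return true;
            }
        }
    }

    // Send queued ops while both a queued op and a credit are available.
    private void drainPending() {
        while (!pending.isEmpty() && tryAcquire()) {
            Runnable op = pending.poll();
            if (op == null) {
                // Another thread took the op first; return the credit and re-check.
                credits.incrementAndGet();
                continue;
            }
            op.run();
        }
    }

    public static void main(String[] args) {
        CreditGate gate = new CreditGate(1);
        List<String> sent = new ArrayList<>();
        gate.submit(() -> sent.add("request-1")); // takes the only credit
        gate.submit(() -> sent.add("request-2")); // no credit left, so it is queued
        System.out.println(sent + " pending=" + gate.pendingCount());
        gate.onResponse(); // response-1 returns the credit, request-2 is sent
        System.out.println(sent + " pending=" + gate.pendingCount());
    }
}
```

The `main` method only exercises the ordinary single-threaded path (one request in flight, one queued, one response). It does not reproduce the race itself, which depends on thread timing.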
