lhotari opened a new issue, #23921: URL: https://github.com/apache/pulsar/issues/23921
### Search before asking - [x] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar. ### Read release policy - [x] I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker. ### Version master branch code analysis ### Minimal reproduce step There's currently an issue that the org.apache.pulsar.broker.service.ServerCnx#completedSendOperation might not get called in error cases. The impact of this is that message publishing could stop for all connections using a particular IO thread. The broker `maxMessagePublishBufferSizeInMB` limit is split into a `maxPendingBytesPerThread` limit: https://github.com/apache/pulsar/blob/3fce3097c76a9c8cb64cf3d8d87f6e050e6cb3a5/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L342-L343 The pending bytes is incremented in sending: https://github.com/apache/pulsar/blob/3fce3097c76a9c8cb64cf3d8d87f6e050e6cb3a5/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L3357 It is decremented in ServerCnx#completedSendOperation method: https://github.com/apache/pulsar/blob/3fce3097c76a9c8cb64cf3d8d87f6e050e6cb3a5/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L3376-L3377 If the call to decrement is missing, there will be a leak which will eventually cause all message publishing to stop for all connections using a particular IO thread. **The leak happens here**: https://github.com/apache/pulsar/blob/2a9d4ac85d8d786979afaa0b965cdb27375ae969/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L732-L749 There should be a call to MessagePublishContext#completed for all exception cases. ServerCnx#completedSendOperation gets called for exception path in MessagePublishContext#completed here: https://github.com/apache/pulsar/blob/3d0625ba64294fb0fe7dafc27c7a34883b4be51b/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/Producer.java#L480-L499 The other exception cases contain the required call to `callback.completed` which will call ServerCnx#completedSendOperation: https://github.com/apache/pulsar/blob/2a9d4ac85d8d786979afaa0b965cdb27375ae969/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L776-L794 ### What did you expect to see? There shouldn't be a leak in `maxPendingBytesPerThread` permits which eventually leads to message publishing stopping for all connections using a particular IO thread. ### What did you see instead? Based on the analysis of the code, there's a leak. ### Anything else? This might be related to issue #23920 ### Are you willing to submit a PR? - [x] I'm willing to submit a PR! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
