poorbarcode opened a new pull request, #21498: URL: https://github.com/apache/pulsar/pull/21498
### Motivation & Modifications

Add a new test that verifies the Broker will not leave an orphan consumer in the scenario below:

1. Register "consumer-1"
   - "consumer-1" will be maintained by the Subscription.
   - "consumer-1" will be maintained by the Dispatcher.
2. Something goes wrong with the connection of "consumer-1". We call this connection "connection-1".
3. Try to register "consumer-2"
   - "consumer-2" will be maintained by the Subscription. At this time, there are two consumers under this subscription.
   - This triggers a connection check task for "connection-1"; we call this task "CheckConnectionLiveness". The task is executed in another thread, which means the lock `synchronized (dispatcher)` is released.
   - "consumer-2" is not maintained by the Dispatcher yet.
4. "CheckConnectionLiveness" will kick out "consumer-1" after 5 seconds, then "consumer-2" will be maintained by the Dispatcher.

---

(Highlight) Race condition: if the connection of "consumer-2" goes into a bad state before step 4, "consumer-2" is maintained by the Subscription but not by the Dispatcher. Could the scenario below happen?

- "connection-2" is closed.
- Remove "consumer-2" from the Subscription.
- Try to remove "consumer-2" from the Dispatcher, but there are no consumers under this Dispatcher, so nothing is removed.
- "CheckConnectionLiveness" finishes and puts "consumer-2" into the Dispatcher.
- At this moment, the consumer state of the Subscription and the Dispatcher is inconsistent: there is an orphan consumer under the Dispatcher.

---

### Why there is no orphan consumer?

There are two mechanisms to avoid orphan consumers:

- After the consumer future is complete, the broker rechecks that the connection is still active. If the connection is inactive, the consumer is removed again.
  see https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L919-L921

  **PersistentTopic.java**
  ```java
  if (!cnx.isActive()) {
      try {
          consumer.close();
      } ...
  }
  ```
- If the method `consumerFuture.complete(v)` returns `false`, the broker removes the consumer again.
  see https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L1271-L1285

  ```java
  if (consumerFuture.complete(consumer)) {
      ...
  } else {
      consumer.close();
  }
  ```

### Documentation

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

### Matching PR in forked repository

PR in forked repository: x

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
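
For readers who want to replay the race described under "Why there is no orphan consumer?", the following is a minimal, single-threaded sketch of how the two safeguards could clean up an orphan. It is not Pulsar code: `OrphanConsumerSketch`, `Connection`, `Consumer`, `register`, and the `beforeDispatcherAdd` hook are all hypothetical names invented for illustration, and the real broker logic in `PersistentTopic` and `ServerCnx` is considerably more involved. The only library behavior relied on is that `CompletableFuture.complete` returns `false` when the future was already completed.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, heavily simplified model; none of these types are Pulsar's real classes.
final class OrphanConsumerSketch {

    /** Stand-in for a client connection that can become inactive. */
    static final class Connection {
        volatile boolean active = true;
    }

    /** Stand-in for a consumer tracked by both the subscription and the dispatcher. */
    static final class Consumer {
        final String name;
        final Connection cnx;
        Consumer(String name, Connection cnx) { this.name = name; this.cnx = cnx; }
    }

    final Set<String> subscription = ConcurrentHashMap.newKeySet();
    final Set<String> dispatcher = ConcurrentHashMap.newKeySet();

    /** Removes the consumer from both places; removing an absent entry is a no-op. */
    void close(Consumer c) {
        subscription.remove(c.name);
        dispatcher.remove(c.name);
    }

    /**
     * Registers a consumer. The hook runs in the window between "added to the
     * subscription" and "added to the dispatcher", which is where the
     * connection check would run in another thread in the scenario above.
     */
    boolean register(Consumer consumer, CompletableFuture<Consumer> consumerFuture,
                     Runnable beforeDispatcherAdd) {
        subscription.add(consumer.name);
        beforeDispatcherAdd.run();
        dispatcher.add(consumer.name);
        // Safeguard 1: recheck the connection once the consumer has been added.
        if (!consumer.cnx.active) {
            close(consumer);
            return false;
        }
        // Safeguard 2: complete(...) returns false if the future was already completed.
        if (!consumerFuture.complete(consumer)) {
            close(consumer);
            return false;
        }
        return true;
    }

    /** Replays the race: "connection-2" closes before the dispatcher add. */
    static boolean orphanAfterConnectionRace() {
        OrphanConsumerSketch broker = new OrphanConsumerSketch();
        Connection cnx2 = new Connection();
        Consumer consumer2 = new Consumer("consumer-2", cnx2);
        boolean registered = broker.register(consumer2, new CompletableFuture<>(), () -> {
            cnx2.active = false;      // "connection-2" closed
            broker.close(consumer2);  // subscription entry removed; dispatcher has nothing to remove
        });
        return registered || broker.subscription.contains("consumer-2")
                || broker.dispatcher.contains("consumer-2");
    }

    /** Assumes some other code path completed the future before registration finished. */
    static boolean orphanAfterFutureAlreadyCompleted() {
        OrphanConsumerSketch broker = new OrphanConsumerSketch();
        Consumer consumer2 = new Consumer("consumer-2", new Connection());
        CompletableFuture<Consumer> future = new CompletableFuture<>();
        future.completeExceptionally(new IllegalStateException("connection closed"));
        boolean registered = broker.register(consumer2, future, () -> { });
        return registered || broker.subscription.contains("consumer-2")
                || broker.dispatcher.contains("consumer-2");
    }

    public static void main(String[] args) {
        System.out.println("orphan after connection race: " + orphanAfterConnectionRace());
        System.out.println("orphan after completed future: " + orphanAfterFutureAlreadyCompleted());
    }
}
```

In this sketch, the late `dispatcher.add` does briefly re-create the orphan, and it is the recheck that follows which removes it again. That mirrors the reasoning in the description, under the stated simplifications.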