poorbarcode opened a new pull request, #21498:
URL: https://github.com/apache/pulsar/pull/21498

   ### Motivation & Modifications
   
   Add a new test that verifies the Broker does not leave an orphan consumer in the scenario below:
   - Register "consumer-1"
     - "consumer-1" will be maintained by the Subscription.
     - "consumer-1" will be maintained by the Dispatcher.
   - The connection of "consumer-1" enters a bad state. We call this connection "connection-1".
   - Try to register "consumer-2"
     - "consumer-2" will be maintained by the Subscription. At this point, there are two consumers under this subscription.
     - This triggers a connection check task for "connection-1". We call this task "CheckConnectionLiveness". The task runs on another thread, which means the `synchronized (dispatcher)` lock is released in the meantime.
     - "consumer-2" is not yet maintained by the Dispatcher.
   - "CheckConnectionLiveness" will kick out "consumer-1" after 5 seconds, then "consumer-2" will be maintained by the Dispatcher.
   
   ---
   
   (Highlight) Race condition: if the connection of "consumer-2" enters a bad state before step 4, "consumer-2" is maintained by the Subscription but not by the Dispatcher. Could the scenario below happen?
   - "connection-2" closed.
   - Remove "consumer-2" from the Subscription.
   - Try to remove "consumer-2" from the Dispatcher, but there are no consumers under this Dispatcher, so nothing is removed.
   - "CheckConnectionLiveness" finishes and puts "consumer-2" into the Dispatcher.
   - At this moment, the consumer state in the Subscription and in the Dispatcher is inconsistent: there is an orphan consumer under the Dispatcher.
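
The interleaving above can be replayed deterministically with a toy model. This is only a sketch and not Pulsar code: the class `OrphanConsumerRace` and its two sets are hypothetical stand-ins for the Subscription and the Dispatcher, "consumer-1" is left out for brevity, and no safeguard is modelled.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model (hypothetical, not Pulsar code): two plain sets stand in for the
// Subscription and the Dispatcher so the interleaving can be replayed in order.
class OrphanConsumerRace {
    final Set<String> subscription = new HashSet<>();
    final Set<String> dispatcher = new HashSet<>();

    // Replays the suspected interleaving without any safeguard.
    static OrphanConsumerRace replayWithoutSafeguard() {
        OrphanConsumerRace state = new OrphanConsumerRace();
        // "consumer-2" is registered with the Subscription only; the Dispatcher
        // add is deferred until "CheckConnectionLiveness" finishes.
        state.subscription.add("consumer-2");
        // "connection-2" closes: the consumer is removed from both sides.
        state.subscription.remove("consumer-2");
        state.dispatcher.remove("consumer-2"); // no-op, it was never added
        // "CheckConnectionLiveness" finishes and the deferred add runs.
        state.dispatcher.add("consumer-2");
        return state;
    }

    // Consumers known to the Dispatcher but unknown to the Subscription.
    Set<String> orphans() {
        Set<String> result = new HashSet<>(dispatcher);
        result.removeAll(subscription);
        return result;
    }

    public static void main(String[] args) {
        System.out.println("orphans: " + replayWithoutSafeguard().orphans());
    }
}
```

In this unguarded model "consumer-2" ends up only in the Dispatcher set, which is the inconsistent state the new test is meant to rule out in the real Broker.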
   
   ---
   
   ### Why is there no orphan consumer?
   There are two mechanisms to avoid orphan consumers:
   - After the consumer future completes, the Broker rechecks whether the connection is still active and removes the consumer again if the connection is inactive. See 
https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L919-L921
   
   **PersistentTopic.java**
   ```java
   if (!cnx.isActive()) {
       try {
           consumer.close();
       } 
   ...
   }
   ```
   
   - If the method `consumerFuture.complete(v)` returns `false`, the Broker removes the consumer again. See 
https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L1271-L1285
   
   ```java
   if (consumerFuture.complete(consumer)) {
       ...
   } else {
       consumer.close();
   }
   ```
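
A minimal sketch of the two safeguards, using only `java.util.concurrent.CompletableFuture` from the JDK. The names `OrphanGuards`, `ToyConsumer`, `recheckConnection`, and `completeOrClose` are hypothetical, and the sketch assumes that some other path (for example, connection teardown) may have completed the future first. It is not the Broker's implementation.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of the two safeguards; none of these names
// exist in Pulsar.
class OrphanGuards {
    // Stand-in for a broker-side consumer that can be closed.
    static final class ToyConsumer {
        boolean closed;
        void close() { closed = true; }
    }

    // Safeguard 1: after the consumer has been added, recheck the connection
    // and close the consumer if the connection is no longer active.
    static void recheckConnection(boolean connectionActive, ToyConsumer consumer) {
        if (!connectionActive) {
            consumer.close();
        }
    }

    // Safeguard 2: CompletableFuture.complete returns false when the future
    // was already completed, so the late consumer is closed rather than kept.
    static void completeOrClose(CompletableFuture<ToyConsumer> consumerFuture,
                                ToyConsumer consumer) {
        if (!consumerFuture.complete(consumer)) {
            consumer.close();
        }
    }

    public static void main(String[] args) {
        // Assumption: the future was already failed by another path.
        CompletableFuture<ToyConsumer> future = new CompletableFuture<>();
        future.completeExceptionally(new IllegalStateException("connection closed"));
        ToyConsumer late = new ToyConsumer();
        completeOrClose(future, late);
        System.out.println("late consumer closed: " + late.closed);
    }
}
```

Either safeguard on its own closes the consumer in the toy model, so a consumer added after its connection went away does not stay registered.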
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
   
   ### Matching PR in forked repository
   
   PR in forked repository: x


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
