rdhabalia opened a new pull request #7964:
URL: https://github.com/apache/pulsar/pull/7964


   ### Motivation
   We have seen frequent cases where topic unloading gets stuck and we have to 
restart the broker. Unloading the topic then returns a 500 with the broker log below:
   ```
   01:26:19.691 [pulsar-web-35-6] INFO  org.apache.pulsar.broker.admin.impl.PersistentTopicsBase - [admin] Unloading topic persistent://tenant/cluster/ns/topic-partition-15
   01:26:19.692 [pulsar-web-35-6] WARN  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://tenant/cluster/ns/topic-partition-15] Topic is already being closed or deleted
   :
   01:26:19.692 [pulsar-web-35-6] ERROR org.apache.pulsar.broker.admin.impl.PersistentTopicsBase - [admin] Failed to unload topic persistent://tenant/cluster/ns/topic-partition-15, org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException: Topic is already fenced
   java.util.concurrent.ExecutionException: org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException: Topic is already fenced
           at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) ~[?:?]
           at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) ~[?:?]
   ```
   
   It happens when the first topic unload gets stuck and never completes, so 
every later unload attempt fails because the topic is already fenced.
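
The failure mode above can be illustrated with a minimal sketch. It is not the broker's code: `FencedTopicSketch`, `close()` and `onAllConsumersRemoved()` are hypothetical names, and the assumption is simply that closing sets a fence flag and then waits on a future that completes only once every consumer has been removed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a topic; NOT the real PersistentTopic.
class FencedTopicSketch {
    private final AtomicBoolean fenced = new AtomicBoolean(false);

    // Assumption: the close completes only after all consumers are removed.
    private final CompletableFuture<Void> consumersRemoved = new CompletableFuture<>();

    CompletableFuture<Void> close() {
        // Only the first caller wins the fence; later callers fail immediately.
        if (!fenced.compareAndSet(false, true)) {
            CompletableFuture<Void> failed = new CompletableFuture<>();
            failed.completeExceptionally(
                    new IllegalStateException("Topic is already fenced"));
            return failed;
        }
        // If a consumer is never removed, this future stays pending forever.
        return consumersRemoved;
    }

    // Called (in this sketch) once the last consumer has been removed.
    void onAllConsumersRemoved() {
        consumersRemoved.complete(null);
    }
}
```

In this sketch, a first `close()` that never sees `onAllConsumersRemoved()` stays pending, and every later `close()` fails right away, which matches the pattern in the log above.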
   
   After investigating a broker heap dump, we found that 
[PersistentDispatcherMultipleConsumers invoked consumer disconnect but could 
not remove the 
consumer](https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentDispatcherMultipleConsumers.java#L183)
 from the cached 
[consumerSet](https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentDispatcherMultipleConsumers.java#L181).
 
   We can also confirm this in the logs: the consumer disconnect messages are 
present, but the message the dispatcher logs after removing the consumer from 
its cache is missing.
   ```
   01:24:20.024 [pulsar-web-35-31] INFO  org.apache.pulsar.broker.service.Consumer - Disconnecting consumer: Consumer{subscription=PersistentSubscription{topic=persistent://tenant/bf1/ns/topic-partition-15, name=sub1}, consumerId=15, consumerName=a36e0, address=/1.1.1.1:42808}
   01:24:20.024 [pulsar-web-35-31] INFO  org.apache.pulsar.broker.service.Consumer - Disconnecting consumer: Consumer{subscription=PersistentSubscription{topic=persistent://tenant/bf1/ns/topic-partition-15, name=sub2}...
   ```
   
   Therefore, this change adds an info log for the case where the dispatcher 
does not find the consumer in its set. That case should never happen, and the 
log will help future investigation if it does.
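
A minimal sketch of the kind of logging described above, under stated assumptions: `DispatcherSketch`, its methods, the `String` consumer identifiers and the log messages are all hypothetical and do not reproduce the actual `PersistentDispatcherMultipleConsumers` code or the exact change in this pull request.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

// Hypothetical, simplified dispatcher; consumers are plain strings here.
class DispatcherSketch {
    private static final Logger log =
            Logger.getLogger(DispatcherSketch.class.getName());

    private final Set<String> consumerSet = ConcurrentHashMap.newKeySet();
    private final String name;

    DispatcherSketch(String name) {
        this.name = name;
    }

    void addConsumer(String consumer) {
        consumerSet.add(consumer);
    }

    // Returns true if the consumer was present and has been removed.
    boolean removeConsumer(String consumer) {
        if (consumerSet.remove(consumer)) {
            log.info("[" + name + "] Removed consumer " + consumer);
            return true;
        }
        // Should never happen; logged so a stuck unload can be diagnosed later.
        log.info("[" + name + "] Consumer not found in consumerSet: " + consumer);
        return false;
    }

    int consumerCount() {
        return consumerSet.size();
    }
}
```

With a log line on the not-found branch, a missing "removed" message no longer leaves the investigator guessing whether removal was attempted at all.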

