massakam opened a new pull request #7096:
URL: https://github.com/apache/pulsar/pull/7096


   Master Issue: #7041
   
   ### Motivation
   
   When a leader broker is restarted, some producers for topics owned by that 
broker may not be reopened on the new broker. When this happens, message 
publishing will continue to fail until the client application is restarted.
   
   Investigation showed that the lookup requests sent by the affected 
producers are redirected more than 10,000 times between multiple brokers.
   
   Each time a lookup request is redirected, 
`BinaryProtoLookupService#findBroker()` calls itself recursively. Tens of 
thousands of redirects therefore cause a `StackOverflowError`, and 
`BinaryProtoLookupService#findBroker()` never completes.
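
   The failure mode can be illustrated with a simplified, synchronous model. 
This is a sketch only, not the actual Pulsar code: `RecursiveLookupSketch`, 
`nextHop`, and the broker names are hypothetical, and the real lookup is 
asynchronous. Two brokers that keep redirecting to each other stand in for 
the redirect storm described above.

```java
import java.util.function.UnaryOperator;

final class RecursiveLookupSketch {
    /**
     * Hypothetical synchronous stand-in for a recursive lookup.
     * nextHop returns the broker to redirect to, or null when the
     * current broker owns the topic.
     */
    static String findBroker(String broker, UnaryOperator<String> nextHop) {
        String redirectTarget = nextHop.apply(broker);
        if (redirectTarget == null) {
            return broker; // no redirect: this broker serves the topic
        }
        // Each redirect adds another stack frame; nothing bounds the depth.
        return findBroker(redirectTarget, nextHop);
    }

    /** Returns true if two brokers redirecting to each other overflow the stack. */
    static boolean pingPongOverflows() {
        UnaryOperator<String> pingPong =
                b -> b.equals("broker-a") ? "broker-b" : "broker-a";
        try {
            findBroker("broker-a", pingPong);
            return false; // unreachable while the redirect loop never ends
        } catch (StackOverflowError e) {
            return true;
        }
    }
}
```

   In this model the recursion depth equals the number of redirects, so an 
unbounded redirect chain exhausts the stack before any result is returned.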
   
   ### Modifications
   
   Limit the number of times a lookup can be redirected. The maximum is 100 
and is user configurable. If the number of redirects exceeds the maximum, 
the lookup fails. However, `ConnectionHandler` retries the lookup, so the 
producer can eventually reconnect to the new broker.
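
   A minimal sketch of the redirect limit follows. It is not the code in 
this pull request: `RedirectLimitSketch`, `Reply`, `LookupClient`, and the 
`redirectCount` / `maxRedirects` parameters are hypothetical names chosen for 
illustration. The idea is to carry a counter through each recursive call and 
fail the returned future once the counter exceeds the configured maximum.

```java
import java.util.concurrent.CompletableFuture;

final class RedirectLimitSketch {
    /** Hypothetical lookup reply: a broker address and whether it is a redirect. */
    static final class Reply {
        final String broker;
        final boolean redirect;

        Reply(String broker, boolean redirect) {
            this.broker = broker;
            this.redirect = redirect;
        }
    }

    /** Hypothetical stand-in for sending one lookup request to one broker. */
    interface LookupClient {
        CompletableFuture<Reply> lookup(String broker, String topic);
    }

    static CompletableFuture<String> findBroker(LookupClient client, String broker,
                                                String topic, int redirectCount,
                                                int maxRedirects) {
        if (redirectCount > maxRedirects) {
            // Fail instead of recursing further; the caller may retry later.
            CompletableFuture<String> failed = new CompletableFuture<>();
            failed.completeExceptionally(new IllegalStateException(
                    "Too many redirects: " + redirectCount + " > " + maxRedirects));
            return failed;
        }
        return client.lookup(broker, topic).thenCompose(reply -> reply.redirect
                // Follow the redirect and count it.
                ? findBroker(client, reply.broker, topic, redirectCount + 1, maxRedirects)
                // Final answer: this broker serves the topic.
                : CompletableFuture.completedFuture(reply.broker));
    }
}
```

   A caller would start with `redirectCount` set to 0 and treat an 
exceptionally completed future as a failed lookup that can be retried, 
which is the role the description above assigns to `ConnectionHandler`.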



