sijie commented on pull request #7506:
URL: https://github.com/apache/pulsar/pull/7506#issuecomment-656909319


   @merlimat it is stuck in LedgerRecoveryOp while recovering cursors. I haven't 
caught the actual exception yet. My feeling is that it comes from the zookeeper side. 
The chaos test we ran hard-kills a Kubernetes worker node. That 
worker node hosts one zookeeper pod, one bookkeeper pod, and one broker pod. 
It looks like some zookeeper call never came back, so the ledger 
recovery op got stuck without triggering any callback, which in turn causes the 
problem in the managed ledger library.
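
   To illustrate the failure mode (a minimal sketch only, not the actual managed ledger or BookKeeper code): an async operation whose callback never fires leaves its caller waiting forever unless the wait is bounded. The class and method names below (`RecoveryTimeoutSketch`, `stuckRecovery`, `withTimeout`) are hypothetical. The only real API used is `CompletableFuture.orTimeout`, which requires Java 9 or later.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class RecoveryTimeoutSketch {
    // Hypothetical stand-in for a recovery op whose callback never fires:
    // the returned future is never completed by anyone.
    static CompletableFuture<String> stuckRecovery() {
        return new CompletableFuture<>();
    }

    // Bound the wait so the caller sees a failure instead of hanging.
    // orTimeout completes the future exceptionally with a TimeoutException
    // if it is still pending after the given delay.
    static <T> CompletableFuture<T> withTimeout(CompletableFuture<T> op, long millis) {
        return op.orTimeout(millis, TimeUnit.MILLISECONDS);
    }

    // Returns true only if the future failed because of the timeout.
    static boolean timesOut(CompletableFuture<?> f) {
        try {
            f.join();
            return false;
        } catch (CompletionException e) {
            return e.getCause() instanceof TimeoutException;
        }
    }

    public static void main(String[] args) {
        boolean timedOut = timesOut(withTimeout(stuckRecovery(), 100));
        System.out.println("stuck recovery timed out: " + timedOut);
    }
}
```

   With a guard like this the stuck op would at least surface as an error that the caller can retry or report, rather than blocking the topic load silently.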
   
   A side note - I created an issue a while ago to separate loading cursors 
from loading managed ledger. The idea is that we should allow producers to 
produce messages once the managed ledger is ready. This would improve write 
availability. https://github.com/apache/pulsar/issues/7404 
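
   A rough sketch of that idea, under the assumption that readiness is tracked as two independent futures. All names here (`SplitOpenSketch`, `ledgerReady`, `cursorsReady`, `produce`, `subscribe`) are made up for illustration and are not the managed ledger API:

```java
import java.util.concurrent.CompletableFuture;

class SplitOpenSketch {
    // Hypothetical readiness signals, completed independently of each other.
    final CompletableFuture<Void> ledgerReady = new CompletableFuture<>();
    final CompletableFuture<Void> cursorsReady = new CompletableFuture<>();

    // Producers wait only for the ledger itself.
    CompletableFuture<String> produce(String msg) {
        return ledgerReady.thenApply(v -> "written:" + msg);
    }

    // Consumers need both the ledger and the recovered cursors.
    CompletableFuture<String> subscribe(String name) {
        return ledgerReady.thenCombine(cursorsReady, (a, b) -> "subscribed:" + name);
    }

    public static void main(String[] args) {
        SplitOpenSketch ml = new SplitOpenSketch();
        ml.ledgerReady.complete(null); // ledger is open, cursors still recovering
        System.out.println("produce done: " + ml.produce("m1").isDone());
        System.out.println("subscribe done: " + ml.subscribe("s1").isDone());
    }
}
```

   In this shape, a cursor recovery that hangs would delay only subscriptions, while writes could proceed as soon as the ledger is open.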


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

