Re: [I] [Bug] Partial partition consumers silent after topic fencing [pulsar]

via GitHub Thu, 15 Jan 2026 23:50:58 -0800


nodece commented on issue #25146:
URL: https://github.com/apache/pulsar/issues/25146#issuecomment-3758623450


   @lhotari Thanks for pointing out the related issues! The fact that #21082 
and similar issues have been reported multiple times, along with the previous 
attempt in PR #20540, confirms this is a systemic problem rather than an 
isolated edge case.
   
   ## Root cause
   
   The fundamental issue is the reliance on one-way broker-to-consumer 
notifications. During topic unloading (as demonstrated in the test scenario), 
there's a critical timing window where:
   
   - Broker-side state transitions rapidly: fencing → unloading → close 
consumer → cleanup cache
   - The close notification can be lost: a consumer connection may be closed 
before the notification reaches it, leaving the consumer in a "zombie" state
   
   ## Proposed Solution
   
   Instead of fixing the complex broker notification timing issues, a more 
robust approach would be:
   
   ### Add a heartbeat mechanism at the consumer level
   
   This would allow consumers to:
   
   - Proactively detect subscription issues
   - Auto-recover from "zombie" states without waiting for broker notifications
   - Handle topic transfers and unloading gracefully
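
   A minimal sketch of what consumer-side detection could look like, assuming 
the broker answers each heartbeat with some acknowledgement. 
`SubscriptionHeartbeat`, `onBrokerAck`, `tick`, and the `resubscribe` callback 
are hypothetical names for illustration, not existing Pulsar client APIs, and 
the clock is injected only to keep the example deterministic.

```java
import java.util.function.LongSupplier;

/** Hypothetical consumer-side liveness tracker; NOT part of the Pulsar client API. */
final class SubscriptionHeartbeat {
    private final long intervalMs;       // expected gap between broker acknowledgements
    private final int missedLimit;       // consecutive missed intervals tolerated
    private final LongSupplier clock;    // injected time source, in milliseconds
    private final Runnable resubscribe;  // assumed recovery action (e.g. reconnect)
    private long lastAckMs;

    SubscriptionHeartbeat(long intervalMs, int missedLimit,
                          LongSupplier clock, Runnable resubscribe) {
        this.intervalMs = intervalMs;
        this.missedLimit = missedLimit;
        this.clock = clock;
        this.resubscribe = resubscribe;
        this.lastAckMs = clock.getAsLong();
    }

    /** Call when the broker confirms the subscription is still attached. */
    synchronized void onBrokerAck() {
        lastAckMs = clock.getAsLong();
    }

    /** Call once per interval; returns true if recovery was triggered. */
    synchronized boolean tick() {
        long silentMs = clock.getAsLong() - lastAckMs;
        if (silentMs < intervalMs * missedLimit) {
            return false; // broker was heard from recently enough
        }
        resubscribe.run();            // suspected zombie state: recover proactively
        lastAckMs = clock.getAsLong(); // restart the detection window
        return true;
    }
}
```

   With this shape the consumer no longer depends on the close notification 
arriving: silence from the broker for `intervalMs * missedLimit` is itself 
treated as the signal to resubscribe.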
   
   ### Trade-offs
   
   Disadvantage: This introduces additional overhead (network traffic, CPU 
cycles on both broker and consumer sides).
   
   Mitigation:
   
   - Make the heartbeat interval configurable (e.g., a default in the 30s-60s range)
   - Only enable for critical consumers or make it opt-in
   - Piggyback heartbeat with existing traffic when possible
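
   The three mitigations can be combined in one small policy object. The 
sketch below is only an illustration: `HeartbeatPolicy`, `enabled`, 
`intervalMs`, and `shouldSendProbe` are hypothetical names, not existing 
Pulsar configuration keys, and "piggybacking" is approximated by skipping the 
probe whenever regular traffic was seen within the interval.

```java
/** Hypothetical opt-in heartbeat settings; NOT existing Pulsar configuration. */
final class HeartbeatPolicy {
    final boolean enabled;   // opt-in: disabled consumers never send probes
    final long intervalMs;   // configurable interval, e.g. 30_000 to 60_000

    HeartbeatPolicy(boolean enabled, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        this.enabled = enabled;
        this.intervalMs = intervalMs;
    }

    /**
     * Returns true only when heartbeats are enabled and no regular traffic
     * (messages, acks, flow commands) was seen within the last interval.
     * Recent traffic already shows the connection is alive, so the
     * dedicated probe is skipped in that case.
     */
    boolean shouldSendProbe(long lastTrafficMs, long nowMs) {
        return enabled && nowMs - lastTrafficMs >= intervalMs;
    }
}
```

   A busy consumer would then send almost no dedicated heartbeats, so the 
extra overhead falls mainly on idle subscriptions, which are also the ones 
where a lost close notification goes unnoticed the longest.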
   
   This follows a common distributed-systems practice: active health checking 
still detects a broken subscription when a one-way notification is dropped, 
though it comes at a performance cost.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
