liangyepianzhou commented on PR #25706: URL: https://github.com/apache/pulsar/pull/25706#issuecomment-4398007309
Thank you for your feedback. I'll keep my response brief so the thread stays short enough for others to review and share their opinions.

**Alt 2** does not solve any of the core problems; it only delays their onset. The argument "if it survives the peak, it's enough" does not hold for sustained hot key scenarios. In production, we cannot bet that "sustained slow-consumption hot keys will never occur," especially for critical business workloads. Relying on that assumption would undermine users' confidence in Pulsar.

**Alt 1** and **Overflow ML** both solve the core problems. Alt 1's cost is storage amplification (the auxiliary cursor's mark-delete cannot advance, so entire ledgers are retained) plus a longer broker restart recovery time (the auxiliary cursor must replay from far behind). Overflow ML's cost is a secondary BK write for hot key data.

**Alt 3** is the weakest: the victim (the stuck consumer) cannot self-rescue.

I still lean toward **Overflow ML** because it addresses all the core problems. Its cost, writing hot key data to a secondary BK ledger, can be managed through disk capacity planning and expansion. Hot keys may persist for a long time, but their data volume is typically a small fraction of total traffic. Meanwhile, its backlog and consumption progress metrics remain clean and straightforward for operators.

Alt 1's storage amplification can also be addressed via disk expansion, but the longer broker restart recovery time is a harder trade-off. Alt 1 could also expose additional metrics for backlog and consumption progress, but that seems more complex and less user-friendly. Overall, Overflow ML provides cleaner operational visibility.

Looking forward to hearing more voices and feedback.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
