lhotari commented on PR #25706:
URL: https://github.com/apache/pulsar/pull/25706#issuecomment-4406950463

   > Based on the discussion so far, I've restructured the **Alternatives** 
section of the PIP. The key change is separating the analysis into distinct 
layers:
   > 
   > 1. **Problem Classification (P1–P5)**: First, explicitly enumerate the 
cascading problems that arise when a Consumer is stuck under Key_Shared, and 
classify which ones this PIP **must** solve versus which can be mitigated 
through other means. I believe we should align on this before debating 
solutions — not every problem needs to be solved in this PIP, and a scheme 
should not be rejected simply because it doesn't address secondary concerns.
   
   This is helpful.  I guess this is also about drilling down deeper into the 
actual problem and causes.
   
   > 2. **Core Problem Resolution Table**: A side-by-side comparison of all 
alternatives (Overflow ML + A through E) against each problem, making it clear 
what each scheme does and does not solve.
   
   Describing tradeoffs is useful. Expressing them in a table can be 
challenging, since explaining what each decision is trading off often requires 
more detail than a table can hold. That said, a table can still work as a 
high-level summary.
   Some characteristics of tradeoffs, such as "complexity", can be highly 
opinionated. One person's "complexity" might mean something quite different to 
someone else. In many cases, complexity is just anything unfamiliar, which 
isn't a sound way to qualify designs. In distributed systems, anything that 
increases the number of distributed interactions adds real complexity to the 
solution.
   
   > 3. **Cost Table**: What each alternative introduces in terms of extra 
storage, I/O, sustained hot-key viability, and client transparency.
   
   The impact on reliability and the risk of incidents are also important 
factors to weigh as part of "cost".
   
   > 4. **Summary**: A concise evaluation of why Overflow ML and Auxiliary 
Cursor are the only two schemes that address both P1 and P2, and the trade-offs 
between them.
   
   Deciding on the solution too early has always been a problem with the Pulsar 
PIP process. Ideally, discussions and interactions would happen in the early 
stages, while everyone is still open to different solutions. Otherwise, it 
easily turns into a situation where people feel they need to defend their point 
of view, and the synergy that a team can find together isn't leveraged. What 
could as a result of this that a real problem doesn't get solved at all due to 
lack of consensus and initially motivated contributors turning away.
   
   Every decision and solution is a tradeoff. Making decisions reversible or 
solutions optional is extremely useful to reduce risk and avoiding increased 
maintenance overhead.
   
   Instead of adding more conditionals to the existing 
PersistentDispatcherMultipleConsumers/PersistentStickyKeyDispatcherMultipleConsumers
 classes, it would be preferable to introduce method hooks so that another, 
more specialized implementation can plug in. That would avoid piling more 
maintenance overhead onto the existing classes when a special-case 
implementation such as the overflow ML is added to the codebase. Another 
possibility is to add configuration so that a third-party implementation can be 
used for Key_Shared or Shared subscription dispatchers.
   
   One way for you to make progress regardless of what gets decided about 
PIP-474 is to add this possibility — refactor the dispatcher classes with hooks 
so that the implementation can be extended, and add configuration so that the 
broker can use a third-party dispatcher implementation. Such a change could be 
handled in a separate PIP, and it would also be useful for other use cases. 
This type of solution would also be a good fit for implementing PIP-474 as a 
separate dispatcher implementation initially and have it later be merged into 
Pulsar when it's a proven solution.
   We could possibly also backport the "pluggable dispatcher" solution to 
maintenance releases since the risk of it causing regressions is relatively low.
   That way you could solve your problem about hot/slow keys as soon as 
possible and gain real production feedback of the implementation.
   
   > This structure also has a practical benefit for PR review: each point 
(problem classification, per-problem resolution, cost dimension) can be 
discussed independently in its own comment thread, rather than getting lost in 
a single long-form reply.
   > 
   > The updated PIP is ready for review. I'd suggest we start by aligning on 
the problem classification (the P1–P5 table) — once we agree on what must be 
solved, the solution comparison follows naturally
   
   Makes sense. I'd just suggest taking a look at the "pluggable dispatcher" 
idea and focusing on that first in a separate PIP. That would let you make 
progress quickly without getting stalled by the decisions in this PIP. The 
"pluggable dispatcher" could also be considered a prerequisite for this PIP. 
Besides the dispatcher itself, there might be a need to extend the interface so 
that the dispatcher can expose implementation-specific statistics or metrics. 
It might also touch other areas, which could broaden the scope enough to 
support something like the PIP-474 implementation.
   
   Covering "overflow ML" could require several other plugin interfaces so that 
the lifecycle of the ML can be tied to the subscription and the topic. Adding 
such interfaces deserves more attention. Coupling the "overflow ML" directly to 
other internals wouldn't be a great idea. Pulsar has multiple listener 
interfaces, but they likely don't cover every need. One way to plug into Pulsar 
today is by implementing a TopicFactory (configurable via 
topicFactoryClassName). Implementing a TopicFactory and creating a custom 
subclass of PersistentTopic is one way to hook into the required extension 
points from custom plugins. Many hooks might be missing for easily extending up 
to the broker side Consumer level if that happens to be required.
   A simpler approach would obviously be to have lifecycle interfaces, so that 
it wouldn't be necessary to go through the pain of implementing a TopicFactory, 
a custom PersistentTopic, and so on, just to get notified about topic and 
subscription deletions so the "overflow ML" can be cleaned up.
   
   Looking forward to your opinion about first focusing on the "pluggable 
dispatcher" part.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to