GitHub user vishnumurthy-nd created a discussion: FIFO guarantees with 
Key_Shared subscriptions when scaling partitions and clearing backlog

I am relatively new to Apache Pulsar and am experimenting to understand its 
behavior under some real-world scaling scenarios. I would really appreciate 
guidance on whether my understanding and design approach are correct.

Requirements / Constraints:
1. Multi-tenant system where tenants (smart meter providers) are dynamically 
onboarded.
2. Each tenant has devices ranging from 10 to 1M, and device count can increase 
over time.
3. Strict FIFO ordering per device (key) must never be violated, even during 
downtime or scaling events.
4. System must tolerate periods where producers are active but consumers are 
temporarily down, causing backlog accumulation.

Proposed Design:
1. One topic per tenant.
2. Topic is partitioned for throughput.
3. Messages are produced with deviceId as the message key.
4. Consumers use Key_Shared subscriptions to guarantee per-key ordering and 
allow consumer scalability.

Scenario I’m Trying to Understand:
1. A tenant topic initially has N partitions (e.g., 2).
2. Producers continue writing data for existing deviceIds while no consumers 
are active, so backlog accumulates.
3. Due to higher ingress or future scale needs, the partition count is 
increased to N + M (e.g., from 2 → 4).
4. Later, consumers start (or scale up).
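
To make the concern in this scenario concrete, here is a minimal sketch of how a key-to-partition mapping can change when the partition count grows. It assumes keyed messages are routed as hash(key) mod partition count, which is my understanding of the default behavior and not something confirmed here. CRC32 is only a stand-in hash, and `partition_for` and the `device-N` ids are hypothetical names for illustration, not Pulsar API.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Toy router: hash(key) mod partition count.

    CRC32 is a stand-in hash chosen for determinism; it is an assumption,
    not the hash function Pulsar clients actually use.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical device ids used as message keys.
device_ids = [f"device-{i}" for i in range(20)]

# Mapping while the topic has 2 partitions, then after growing to 4.
before = {d: partition_for(d, 2) for d in device_ids}
after = {d: partition_for(d, 4) for d in device_ids}

# Keys whose target partition differs between the two partition counts.
moved = sorted(d for d in device_ids if before[d] != after[d])

for d in device_ids:
    status = "moved" if d in moved else "stayed"
    print(f"{d}: partition {before[d]} -> {after[d]} ({status})")
```

Under this model, a key either stays on its original partition or moves to a newly added one, so any key that moves ends up with its backlog on the old partition and its new messages on a different one.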

Questions:
1. After increasing the partition count:
     - Will messages for existing deviceIds continue to be routed to their 
original partitions?
     - Or can existing keys be rehashed to newly added partitions?
2. When consumers start after downtime:
     - How does Pulsar merge backlog + live traffic while preserving strict 
FIFO per key?
     - Is backlog always drained in partition order before newer messages for 
the same key are delivered?
3. With Key_Shared subscriptions:
     - Is per-key ordering guaranteed even if consumers join after backlog has 
accumulated?
     - Are there any edge cases where unacked messages for a key can block or 
delay delivery of other keys?
4. From a design perspective:
     - Is increasing partition count on an existing partitioned topic 
compatible with strict FIFO per key?
     - Or is the recommended approach to fix partition count at topic creation 
time and scale only via consumers?
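
The failure mode behind questions 2 and 4 can be sketched as a toy model. It assumes each partition is an independent FIFO queue with no ordering across partitions, and that one key's messages end up split between an old partition (backlog) and a new partition (live traffic). Nothing below is Pulsar API; the partition numbers, `device-42`, and `drain` are hypothetical.

```python
from collections import deque

# Toy model: each partition is an independent FIFO queue.
partitions = {0: deque(), 2: deque()}

# Backlog written while the topic had 2 partitions: the key lands on partition 0.
for seq in (1, 2, 3):
    partitions[0].append(("device-42", seq))

# Assumption: after the resize, the same key now routes to partition 2.
for seq in (4, 5):
    partitions[2].append(("device-42", seq))


def drain(order):
    """Return the sequence numbers delivered when partitions are drained in `order`."""
    delivered = []
    for p in order:
        queue = deque(partitions[p])  # copy so the model can be replayed
        while queue:
            _key, seq = queue.popleft()
            delivered.append(seq)
    return delivered


in_order = drain([0, 2])      # old partition fully drained first
out_of_order = drain([2, 0])  # new partition consumed before the backlog

print(in_order)
print(out_of_order)
```

In this model, per-key FIFO survives only if the old partition's backlog is fully consumed before anything is read from the new partition, which is the guarantee I am asking about.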

I want to ensure I am not misunderstanding any internal guarantees or relying 
on behavior that is not officially supported.

Thanks in advance for your help and for maintaining Pulsar.

GitHub link: https://github.com/apache/pulsar/discussions/25131
