Hey Girish,

First, I have to say that I *love* this proposal and, in general, these types
of proposals.
This is the kind of work that moves Pulsar toward being an even more
next-generation messaging system.

I read it and have a few questions and brainstorming ideas that popped into my
mind:

1. The current design basically says: let’s have a read-only toggle (flag)
for each partition. When I decrease the partitions from, say, 2 to 1, then
if the partitions were “billing-0” and “billing-1”, “billing-1” will now be
marked read-only, and eventually the client will only produce messages to
“billing-0”. After 1 hour, I can scale it back to 2 partitions, and
“billing-1” will be toggled back to read-only=false.

* I know you stated that ordered consumption is out of scope. The thing I
fear here is that even for shared subscriptions, in which order doesn’t
matter, it still feels a bit weird that when you consume from the
beginning, you can suddenly consume messages that are 1 hour apart from
each other, one after another. Something like:

P0  | t1 | t3 | t7 | t10| t11| t13| t17|
    +----+----+----+----+----+----+----+
P1  | t2 | t4 | t6 | t9 | t12| t14| t16|
    +----+----+----+----+----+----+----+
P2  |    |    | t5 | t8 |    |    | t15|
    |    |    |    |    |    |    |    |
----+----+----+----+----+----+----+----+
                        ^          ^
                        RO         URO


t5 is when you scaled to 3 partitions.
“RO” is when you change from 3 partitions to 2.
“URO” is when you change back to 3 partitions.

When you consume this partitioned topic from the beginning, you will
consume t15 mixed with t6 and t7, which can be hours apart.

I understand this can happen today if you only add a partition and read
from the beginning.
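
Just to make it concrete, this is roughly the experiment I have in mind: a
plain Java client with a shared subscription reading from the earliest
position (the topic name, subscription name and service URL below are just
placeholders). After a scale-down followed by a scale-up, the publish
timestamps printed here can jump back and forth by hours, e.g. t15 delivered
right next to t6 and t7:

import java.time.Instant;

import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.SubscriptionInitialPosition;
import org.apache.pulsar.client.api.SubscriptionType;

public class ReplayTimestamps {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                .topic("persistent://public/default/billing")
                .subscriptionName("replay-check")
                // Shared, so ordering is explicitly not expected.
                .subscriptionType(SubscriptionType.Shared)
                .subscriptionInitialPosition(SubscriptionInitialPosition.Earliest)
                .subscribe();

        // Print the broker-assigned publish time of each delivered message.
        // Once a partition has been parked as read-only and later re-activated,
        // these timestamps are no longer even roughly monotonic across partitions.
        for (int i = 0; i < 100; i++) {
            Message<byte[]> msg = consumer.receive();
            System.out.printf("%s published at %s%n",
                    msg.getMessageId(), Instant.ofEpochMilli(msg.getPublishTime()));
            consumer.acknowledge(msg);
        }

        consumer.close();
        client.close();
    }
}

Not a correctness problem for a shared subscription, just a surprising
experience for anyone replaying a backlog.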

2. If we keep ordered consumption out of scope, how do we keep users
from doing “wrong” things, like using failover-type subscriptions on
partitioned topics that have decreased their partitions? A topic and its
partition count are a detached “entity” from its consumption type.
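
For example - and this is purely hypothetical, none of it exists today - I
could imagine the broker consulting the proposed read-only/scaled-down
metadata at subscribe time and rejecting ordered subscription types, roughly:

import org.apache.pulsar.client.api.SubscriptionType;

// Hypothetical guard: 'scaledDown' would come from the proposed
// partitioned-topic metadata property, which does not exist yet.
public final class ScaleDownGuard {

    public static void checkSubscription(SubscriptionType type, boolean scaledDown) {
        boolean ordered = type == SubscriptionType.Exclusive
                || type == SubscriptionType.Failover;
        if (scaledDown && ordered) {
            throw new IllegalStateException(
                    "Ordered subscription types are not supported on a partitioned "
                            + "topic whose partition count has been decreased");
        }
    }
}

Whether something like that belongs in the broker, in topic policies, or only
in documentation is exactly my question.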


I’m curious if you thought of implementing it following the pattern we have
today for BK. When an ensemble changes, it simply adds the new ensemble to
a list of ensembles, so you follow a chain of servers when you read from a
ledger. You read from (b1,b2,b3) and then switch to (b1, b3, b5).

What if a partitioned topic were exactly that: a chain of lists, where each
list contains the topics (partitions)?
Something like:
(billing-0-100, billing-1-101), (billing-0-102, billing-1-103,
billing-2-104), (billing-0-105, billing-1-106)
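
In code, I picture the metadata shaped roughly like the sketch below
(completely made-up types, just to illustrate the idea: producers write only
to the last segment, readers walk the chain in order, analogous to a ledger's
ensemble list):

import java.util.List;

// Made-up sketch of "a partitioned topic is a chain of partition lists".
// None of these types exist in Pulsar today.
public record PartitionedTopicChain(String topic, List<Segment> segments) {

    // One "epoch" of the topic: the partitions that were writable until the
    // next resize happened.
    public record Segment(List<String> partitions) {}

    public static PartitionedTopicChain billingExample() {
        return new PartitionedTopicChain("billing", List.of(
                new Segment(List.of("billing-0-100", "billing-1-101")),
                new Segment(List.of("billing-0-102", "billing-1-103", "billing-2-104")),
                new Segment(List.of("billing-0-105", "billing-1-106"))));
    }

    // Producers only ever write to the last segment in the chain.
    public Segment current() {
        return segments.get(segments.size() - 1);
    }
}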

It’s only a direction - just wondering if something like that has been
considered.


On Fri, Jan 19, 2024 at 8:28 AM Girish Sharma <scrapmachi...@gmail.com>
wrote:

> Hello everyone,
>
> As a true cloud native platform, which supports scale up and scale down, I
> feel like there is a need to be able to reduce partition count in Pulsar to
> truly achieve a scale down after events like sales (akin to Black Friday,
> etc.) or a huge temporary publish burst due to backfill.
>
> I looked through the archives (up to 2021) and did not find any prior
> discussion on the same topic.
>
> I have given this an initial thought to figure out what it would need to
> support such a feature in the lowest footprint possible. I am attaching the
> document explaining the need, requirements and initial high level details
> [0]. What I would like to understand is whether the community also finds
> this feature helpful, and whether the approach described in the document has
> some fatal flaw. Summarizing the approach here as well:
>
>    - Introduce an ability to convert a normal topic object into a read-only
>    topic via admin api and an additional partitioned-topic metadata
> property
>    (just like shadow source, etc)
>    - Add logic to block produce but allow new consumers and dispatch call
>    based on this flag
>    - Add logic in GC to clean out read only topics when all of their
>    ledgers expire (TTL/retention)
>
> Goal is that there is no data movement involved and no impact on existing
> partitions during this scale down.
>
> Looking forward to the discussion.
>
> [0]
>
> https://docs.google.com/document/d/1sbGQSwDihQftIRsxAXg5Zm4uxKQ0kRk9HadKYRFTswI/edit?usp=sharing
>
> Regards
> --
> Girish Sharma
>
