Hello Asaf, thank you for taking a look at this. I will have a formal PIP
sometime by the end of March. I am trying to close on the rate limiting
PIPs first.

On Sun, Feb 18, 2024 at 3:47 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:

> Hey Girish,
>
> First, I say that I *love* this proposal and, in general, those types of
> proposals.
> This is what strides Pulsar towards being an even more next-generation
> messaging system.
>
> I read and have a few questions and brainstorming ideas popping into my
> mind:
>
> 1. The current design basically says: Let’s have a read-only toggle (flag)
> for each partition. When I decrease the partitions from, say, 2 to 1, then
> if the partitions were “billing-0” and “billing-1”, now “billing-1” will be
> marked read-only, and eventually, the client will only produce messages to
> “billing-0”. After 1 hour, I can scale it back to 2 partitions, and now the
> “billing-1” will be toggled back to read-only=false.
>

This is true. But it is probably only an extension of a problem that
already exists today: if you scale up a topic with 3 days of retention from
2 to 3 partitions and start a new subscription from the beginning, you will
see a drastic time difference between the messages of the older partitions
and those of the newer one.
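
To make the read-only toggle from point 1 concrete, here is a minimal
Python sketch of how a producer-side router might skip read-only
partitions. The names (Partition, read_only, make_round_robin) are
hypothetical and are not Pulsar client APIs; it only illustrates the
behavior described above, assuming a simple round-robin router.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Partition:
    name: str
    read_only: bool = False  # hypothetical per-partition flag from the proposal


def writable(partitions):
    """Return the partitions a producer may still publish to."""
    return [p for p in partitions if not p.read_only]


def make_round_robin(partitions):
    """Round-robin over writable partitions, re-evaluated on every call."""
    counter = count()

    def pick_partition():
        candidates = writable(partitions)
        if not candidates:
            raise RuntimeError("no writable partitions")
        return candidates[next(counter) % len(candidates)]

    return pick_partition


partitions = [Partition("billing-0"), Partition("billing-1")]
pick = make_round_robin(partitions)

# Scale down from 2 to 1: billing-1 becomes read-only but can still be drained.
partitions[1].read_only = True
after_scale_down = {pick().name for _ in range(4)}

# Scale back up to 2: the flag is cleared and billing-1 accepts writes again.
partitions[1].read_only = False
after_scale_up = {pick().name for _ in range(4)}
```

Consumers are deliberately absent from the sketch: under the proposal they
keep reading from every partition, read-only or not.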



>
> * I know you stated that ordered consumption is out of scope. The thing I
> fear here is that even for shared subscriptions, in which order doesn’t
> matter, it still feels a bit weird that when you consume from the
> beginning, you can suddenly consume messages that are 1 hour apart from
> each other, one after another. Something like:
>
> P0  | t1 | t3 | t7 | t10| t11| t13| t17|
>     +----+----+----+----+----+----+----+
> P1  | t2 | t4 | t6 | t9 | t12| t14| t16|
>     +----+----+----+----+----+----+----+
> P2  |    |    | t5 | t8 |    |    | t15|
>     |    |    |    |    |    |    |    |
> ----+----+----+----+----+----+----+----+
>                         ^          ^
>                         RO         URO
>
>
> t5 - you scaled to 3 partitions.
> “RO” is when you change from 3 partitions to 2
> “URO” is when you change back to 3 partitions.
>
> When you consume this partitioned topic from the beginning, you will
> consume t15 mixed with t6 and t7, which can be hours apart.
>

Even if the messages are hours apart, they still honor the ordering
guarantee of a topic, i.e. order is maintained within a partition :)


>
> I understand this can happen today if you only add a partition and read
> from the beginning.
>

Exactly! Maybe there is a need to solve this, maybe not, as even Kafka has
similar behavior. I am not aware of any discussions on their side about
changing it, though.


> 2. If we keep ordered consumption out of scope, how do we keep the users
> from doing “wrong” things, like using failover type subscriptions on
> partitioned topics that have decreased their partitions? Topic and its
> partition count is a detached “entity” from its consumption type.
>
>
It would be very easy to propose a live check in the topic update command:
if there are exclusive/failover subscriptions attached to the topic, we
reject the update. We should actually do this today too, since the same
issue exists when the partition count is increased.
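
A rough Python sketch of what such a live check could look like. This is
not the broker's actual code path; check_partition_update, try_update and
PartitionUpdateRejected are hypothetical names, and the subscriptions are
assumed to be available as a simple name-to-type mapping.

```python
# Subscription types that assume a stable partition-to-consumer mapping.
# Only the two types named above are listed; treating exactly this set as
# blocking is an assumption of the sketch.
ORDER_SENSITIVE_TYPES = {"Exclusive", "Failover"}


class PartitionUpdateRejected(Exception):
    """Raised when a partition count change is blocked (hypothetical)."""


def check_partition_update(subscriptions):
    """Reject the update if any order-sensitive subscription is attached.

    subscriptions: dict mapping subscription name -> subscription type.
    """
    blocking = sorted(
        name for name, sub_type in subscriptions.items()
        if sub_type in ORDER_SENSITIVE_TYPES
    )
    if blocking:
        raise PartitionUpdateRejected(
            "partition count change blocked by subscriptions: "
            + ", ".join(blocking)
        )


def try_update(subscriptions):
    """Return (allowed, reason) instead of raising, for easy inspection."""
    try:
        check_partition_update(subscriptions)
        return True, ""
    except PartitionUpdateRejected as exc:
        return False, str(exc)
```

The same check would apply to both directions of the change, which is why
it could be added independently of the scale-down work.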


>
> I’m curious if you thought of implementing it following the pattern we have
> today for BK. When an ensemble changes, it simply adds the new ensemble to
> a list of ensembles, so you follow a chain of servers when you read from a
> ledger. You read from (b1,b2,b3) and then switch to (b1, b3, b5).
>
> What if a partitioned topic is exactly that? It is a chain of lists. Each
> list contains the topics (partitions).
> Something like:
> (billing-0-100, billing-1-101), (billing-0-102, billing-1-103,
> billing-2-104), (billing-0-105, billing-1-106)
>
> It’s only a direction - just wondering if something like that has been
> considered.
>
I believe this will be a very drastic change. I haven't looked in this
direction, but this will touch almost every aspect of the broker, from
dedupe to transactions and beyond. I think almost all of the broker-level
features rely on the fact that a partition will always be owned by a single
topic at any given time. This will lead to an active partition for a single
partition across brokers.
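
For anyone following along, the chain-of-lists idea quoted above might be
modeled roughly as below. This is only an illustration of the data
structure, using the topic names from Asaf's example; the function names
and the "drain oldest generation first" read semantics are assumptions, and
none of it reflects existing broker code.

```python
# Each entry is one "generation" of the partitioned topic, oldest first.
# Names are taken from the example quoted above.
chain = [
    ["billing-0-100", "billing-1-101"],
    ["billing-0-102", "billing-1-103", "billing-2-104"],
    ["billing-0-105", "billing-1-106"],
]


def write_targets(chain):
    """Producers only see the newest generation."""
    return list(chain[-1])


def read_order(chain):
    """Consumers drain generations oldest to newest (assumed semantics)."""
    return [topic for generation in chain for topic in generation]


def resize(chain, new_topics):
    """Changing the partition count appends a generation; nothing is mutated."""
    return chain + [list(new_topics)]
```

Even in this toy form, every logical partition maps to several underlying
topics over time, which is the part that would ripple through dedupe,
transactions and the rest.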


-- 
Girish Sharma
