Hello PengHui,
Thank you for the reply. Adding comments inline below with a few concerns.

On Mon, Jul 3, 2023 at 4:38 PM PengHui Li <peng...@apache.org> wrote:

> Hi Girish,
> Thanks for raising the discussion.
> I can confirm that your understanding is correct, and the document
> is confusing. If there are four consumers connected to a partitioned topic
> with two partitions, each partition will have four connected consumers but
> only one active consumer. The document said two consumers are connected
> to each partition is wrong. We will try to improve the document, and your
> contribution is welcome if you want to improve it.

Yes, the part where it shows only 2 consumers connected is misleading,
but from an information point of view, it is still okay to show only 2 in
the visualization, as one is active and the other is the backup (next in line).

The confusion arises where it implies that the active consumers are
uniformly spread, i.e. in the example, consumers A and C are active,
while in reality consumers A and B are active.
Maybe there is scope for improving the visualization there.

> For the consumer shift for the partition without active consumer failures.
> I think it should be a load-balance consideration. Kafka has a consumer
> group coordinator, which can balance traffic between consumers. But Pulsar
> doesn't have. So Pulsar has to re-assign the active consumer when the
> consumer leaves, no matter whether the consumer is active or not.

From a code perspective, I do understand that it's tricky to ensure minimal
re-assignment. But this should be highlighted in the documentation as it
has implications in terms of ordered consumption as described below.
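
To make the re-assignment concrete, here is a minimal sketch that replays the
example from my original mail (quoted further down). It only models the
"partition index % number of consumers" pick as I understand it. The class and
method names are mine (hypothetical), it is not the broker code, and it
assumes all consumers have the same priority.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the broker code: models only the
// "partitionIndex % numberOfConsumers" pick, assuming equal priorities.
class FailoverPickSketch {

    // Returns the consumer that would be active for the given partition.
    static String activeConsumer(int partitionIndex, List<String> consumers) {
        return consumers.get(partitionIndex % consumers.size());
    }

    static void print(List<String> consumers) {
        System.out.println(consumers
                + ": p0 -> " + activeConsumer(0, consumers)
                + ", p1 -> " + activeConsumer(1, consumers));
    }

    public static void main(String[] args) {
        List<String> consumers = new ArrayList<>(Arrays.asList("A", "B", "C", "D"));
        print(consumers);       // p0 -> A, p1 -> B

        consumers.remove("B");  // B disconnects, remaining (A, C, D)
        print(consumers);       // p0 -> A, p1 -> C

        consumers.remove("A");  // A disconnects, remaining (C, D)
        print(consumers);       // p0 -> C, p1 -> D
    }
}
```

In the last step p1 moves from C to D even though C never disconnected, which
is the shift I am concerned about.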

> Frankly, it's not the best policy for all the cases. IMO, Pulsar also can
> have different policies for assigning active consumers for different
> requirements. Do you have a real case that the unnecessary consumer shift
> will impact? Which will help us to understand the value of introducing
> different policies. All I can think of at the moment are load balance (if
> the traffic of the partitions is far from each other) and the duplicated
> messages when switching the active consumer.

Right, so currently I do see these challenges:

   - Unlike KEY_SHARED, there is no logic to start sending data to newly
   assigned consumers *only after* the previous one acks to a certain
      - This in turn leads to chances of out of order consumption and
      duplicate consumption where the in queue messages of older consumers may
      still be processed while the same messages are sent to new consumers as
   - For any disconnected or newly added consumer, more than one partition
   gets affected based on the index of the consumer which got removed.
   - What is the use of setting a consumer priority to anything below the
   highest? The code seems to consider only the highest-priority consumers
   when spreading active consumers, and ignores any consumer whose priority
   is lower than the highest among the connected consumers. This means those
   consumers would always sit idle as long as there is at least 1 consumer
   with a higher priority. For example, if ten consumers (consumer priority 1
   through 10) are connected to 10 partitions, all 10 partitions would send
   data to just one of the consumers at any given time.
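
The priority point can be sketched the same way. This is my reading of the
behaviour, not the broker code: the class, the field names and the filtering
step are hypothetical, and I am assuming that a lower priority number means a
higher priority.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of my reading of the priority handling.
// Assumption: a lower priorityLevel number means a higher priority.
class PriorityPickSketch {

    static final class Consumer {
        final String name;
        final int priorityLevel;

        Consumer(String name, int priorityLevel) {
            this.name = name;
            this.priorityLevel = priorityLevel;
        }
    }

    // Keeps only the consumers at the best priority level, then applies
    // the same "partitionIndex % size" pick among them.
    static Consumer activeConsumer(int partitionIndex, List<Consumer> consumers) {
        int best = Integer.MAX_VALUE;
        for (Consumer c : consumers) {
            best = Math.min(best, c.priorityLevel);
        }
        List<Consumer> top = new ArrayList<>();
        for (Consumer c : consumers) {
            if (c.priorityLevel == best) {
                top.add(c);
            }
        }
        return top.get(partitionIndex % top.size());
    }

    public static void main(String[] args) {
        // Ten consumers with priorities 1 through 10, ten partitions.
        List<Consumer> consumers = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            consumers.add(new Consumer("c" + i, i));
        }
        for (int p = 0; p < 10; p++) {
            System.out.println("p" + p + " -> " + activeConsumer(p, consumers).name);
        }
    }
}
```

Under these assumptions every partition picks c1, and the other nine
consumers never become active while c1 stays connected.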


> Regards,
> Penghui
> On Mon, Jul 3, 2023 at 2:49 PM Girish Sharma <scrapmachi...@gmail.com>
> wrote:
> > Bumping this up. Would really like to discuss this in the community.
> >
> > Regards
> >
> > On Wed, Jun 28, 2023 at 11:49 PM Girish Sharma <scrapmachi...@gmail.com>
> > wrote:
> >
> > > Hi everyone, I am trying to understand the failover subscription logic
> > > in a bit more detail. Specifically, the doc
> > > <https://pulsar.apache.org/docs/3.0.x/concepts-messaging/#failover>
> > > mentions this part for partitioned topics:
> > >
> > >
> > >
> > > * If the number of partitions in a partitioned topic is less than the
> > > number of consumers: For example, in the diagram below, this partitioned
> > > topic has 2 partitions and there are 4 consumers. Each partition has 1
> > > active consumer and 1 stand-by consumer.*
> > >
> > >
> > >    - *For p0, consumer A is the master consumer, while consumer B would
> > >    be the next consumer in line to receive messages if consumer A is
> > >    disconnected.*
> > >    - *For p1, consumer C is the master consumer, while consumer D would
> > >    be the next consumer in line to receive messages if consumer C is
> > >    disconnected*.
> > >
> > > So, as per this, since all four consumers (A, B, C, D) connect to
> > > both partitions p0 and p1, the consumers array size in
> > > AbstractDispatcherSingleActiveConsumer should be 4. Now, based on the
> > > consumer index choosing logic spanning lines 126 - 130
> > > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractDispatcherSingleActiveConsumer.java#L126-L130>,
> > > the consumer index assigned to p0 should be 0 (i.e. A) and to p1 should
> > > be 1 (i.e. B). I am assuming here that all 4 consumers have the same
> > > priority.
> > >
> > > Now consider consumer B getting disconnected. The remaining consumer
> > > array == (A, C, D). In this case, p1 will get a new consumer using the
> > > logic 1 % 3 = 1, i.e. consumer C. p0's consumer would remain the same,
> > > i.e. 0 % 3 = 0, i.e. A.
> > >
> > > Next, consider that consumer A also goes down. The remaining consumer
> > > array == (C, D). In this case, p0 will get a new consumer: 0 % 2 = 0,
> > > i.e. consumer C, and p1 would now be shifted to 1 % 2 = 1, i.e.
> > > consumer D. Even though p1's active consumer was untouched, p1 got a
> > > consumer shift.
> > >
> > > So I have a couple of questions:
> > >
> > >    - Am I missing something? Is my understanding of logic correct?
> > >    - If yes, why does the doc say what it says? And why change p1's
> > >    consumer needlessly in the above example?
> > >
> > >
> > > Regards
> > > --
> > > Girish Sharma
> > >
> >
> >
> > --
> > Girish Sharma
> >

Girish Sharma
