I filed this jira, fwiw:  https://issues.apache.org/jira/browse/KAFKA-2172

Jason

On Mon, Mar 23, 2015 at 2:44 PM, Jiangjie Qin <j...@linkedin.com.invalid>
wrote:

> Hi Jason,
>
> Yes, I agree the restriction makes round-robin less flexible to use.
> I think the focus of the round-robin strategy is workload balance. If
> different consumers are consuming from different topics, the group is
> unbalanced by nature. In that case, could you use a different consumer
> group for each set of topics?
> The rolling update is a good point. If you do a rolling bounce in a small
> window, the rebalance retry should handle it. But if you want to canary a
> new topic setting on one consumer for some time, it won’t work.
> Could you share the use case in more detail, so we can see if there is
> any workaround?
>
> Jiangjie (Becket) Qin
>
> On 3/22/15, 10:04 AM, "Jason Rosenberg" <j...@squareup.com> wrote:
>
> >Jiangjie,
> >
> >Yeah, I welcome the round-robin strategy, as the 'range' strategy ('til
> >now the only one available) is not always good at balancing partitions,
> >as you observed above.
> >
> >The main thing I'm bringing up in this thread though is the question of
> >why there needs to be a restriction to having a homogeneous set of
> >consumers in the group being balanced.  This is not a requirement for the
> >range algorithm, but is for the roundrobin algorithm.  So, I'm just
> >wanting to understand why there's that limitation.  (And sadly, in our
> >case, we do have heterogeneous consumers using the same groupid, so we
> >can't easily turn on roundrobin at the moment, without some effort :) ).
> >
> >I can see that it does simplify the implementation to have that
> >limitation, but I'm just wondering if there's anything fundamental that
> >would prevent an implementation that works over heterogeneous consumers.
> >E.g. "Lay out all partitions, and lay out all consumer threads, and
> >proceed round robin assigning each partition to the next consumer thread.
> >*If the next consumer thread doesn't have a selection for the current
> >partition, then move on to the next consumer-thread...."*
> >
> >The current implementation is also problematic if you are doing a rolling
> >restart of a consumer cluster.  Let's say you are updating the topic
> >selection as part of an update to the cluster.  Once the first node is
> >updated, the entire cluster will no longer be homogeneous until the last
> >node is updated, which means you will have a temporary outage consuming
> >data until all nodes have been updated.  So, it makes it difficult to do
> >rolling restarts, or canary updates on a subset of nodes, etc.
> >
> >Jason
> >
> >On Fri, Mar 20, 2015 at 10:15 PM, Jiangjie Qin <j...@linkedin.com.invalid>
> >wrote:
> >
> >> Hi Jason,
> >>
> >> The motivation behind round robin is to better balance the consumers'
> >> load. Imagine you have two topics each with two partitions. These topics
> >> are consumed by two consumers each with two consumer threads.
> >>
> >> The range assignment gives:
> >> T1-P1 -> C1-Thr1
> >> T1-P2 -> C1-Thr2
> >> T2-P1 -> C1-Thr1
> >> T2-P2 -> C1-Thr2
> >> Consumer 2 will not be consuming from any partitions.
> >>
> >> The round robin algorithm gives:
> >> T1-P1 -> C1-Thr1
> >> T1-P2 -> C1-Thr2
> >> T2-P1 -> C2-Thr1
> >> T2-P2 -> C2-Thr2
> >> It is much better than range assignment.
> >>
> >> That's the reason why we introduced round robin strategy even though it
> >> has restrictions.
> >>
> >> Jiangjie (Becket) Qin
> >>
> >>
> >> On 3/20/15, 12:20 PM, "Jason Rosenberg" <j...@squareup.com> wrote:
> >>
> >> >Jiangjie,
> >> >
> >> >The error messages I got (and the config doc) do clearly state that the
> >> >number of threads per consumer must match also....
> >> >
> >> >I'm not convinced that an easy to understand algorithm would work fine
> >> >with a heterogeneous set of selected topics between consumers.
> >> >
> >> >Jason
> >> >
> >> >On Thu, Mar 19, 2015 at 8:07 PM, Mayuresh Gharat
> >> ><gharatmayures...@gmail.com> wrote:
> >> >
> >> >> Hi Becket,
> >> >>
> >> >> Can you give an example of this? It would be easier to understand :)
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Mayuresh
> >> >>
> >> >> On Thu, Mar 19, 2015 at 4:46 PM, Jiangjie Qin <j...@linkedin.com.invalid>
> >> >> wrote:
> >> >>
> >> >> > Hi Jason,
> >> >> >
> >> >> > The round-robin strategy first takes the partitions of all the
> >> >> > topics a consumer is consuming from, then distributes them across
> >> >> > all the consumers. If different consumers are consuming from
> >> >> > different topics, the assignment algorithm will generate different
> >> >> > answers on different consumers.
> >> >> > It is OK for consumers to have different thread counts, but the
> >> >> > consumers have to consume from the same set of topics.
> >> >> >
> >> >> >
> >> >> > For the range strategy, the balance is for each individual topic
> >> >> > instead of across topics. So the balance is only done for the
> >> >> > consumers consuming from the same topic.
> >> >> >
> >> >> > Thanks.
> >> >> >
> >> >> > Jiangjie (Becket) Qin
> >> >> >
> >> >> > On 3/19/15, 4:14 PM, "Jason Rosenberg" <j...@squareup.com> wrote:
> >> >> >
> >> >> > >So,
> >> >> > >
> >> >> > >I've run into an issue migrating a consumer to use the new
> >> >> > >'roundrobin' partition.assignment.strategy.  It turns out that
> >> >> > >several of our consumers use the same group id, but instantiate
> >> >> > >several different consumer instances (with different topic
> >> >> > >selectors and thread counts).  Often, this is done in a single
> >> >> > >shared process.  It turns out this arrangement is not allowed when
> >> >> > >using the 'roundrobin' assignment strategy.
> >> >> > >
> >> >> > >I'm curious as to the reason for this restriction?  Why is it not
> >> >> > >also a restriction for the 'range' strategy (which we've been
> >> >> > >happily using for some time now)?
> >> >> > >
> >> >> > >It would seem that as long as you always assign a partition to a
> >> >> > >consumer instance that is actually selecting it, you should still
> >> >> > >be able to proceed with the round-robin algorithm (potentially
> >> >> > >skipping consumers if they can't select the next partition in the
> >> >> > >list, etc.).
> >> >> > >
> >> >> > >Jason
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >> --
> >> >> -Regards,
> >> >> Mayuresh R. Gharat
> >> >> (862) 250-7125
> >> >>
> >>
> >>
>
>
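
The two assignment listings quoted above, and the "skip to the next
consumer thread" variant proposed in the thread, can be illustrated with a
small sketch. This is only an illustration in Python, not Kafka's actual
assignor code: the function names, the dictionary inputs, and the use of
plain sorting to order partitions and threads are all assumptions made for
the example. It also assumes every consumer sees the same complete map of
which threads select which topics, which the existing implementation may
not guarantee.

```python
from collections import defaultdict


def range_assign(partitions_by_topic, threads_by_topic):
    """Per-topic range assignment (illustrative sketch).

    Each topic is balanced on its own: its partitions are split into
    contiguous ranges over the sorted threads that select that topic.
    """
    assignment = defaultdict(list)
    for topic in sorted(partitions_by_topic):
        partitions = sorted(partitions_by_topic[topic])
        threads = sorted(threads_by_topic.get(topic, []))
        if not threads:
            continue  # nobody selects this topic
        per_thread, extra = divmod(len(partitions), len(threads))
        start = 0
        for i, thread in enumerate(threads):
            # The first `extra` threads take one additional partition.
            count = per_thread + (1 if i < extra else 0)
            for p in partitions[start:start + count]:
                assignment[thread].append((topic, p))
            start += count
    return dict(assignment)


def round_robin_assign(partitions_by_topic, threads_by_topic):
    """Round robin across all topics, skipping threads that do not
    select the current partition's topic (hypothetical variant)."""
    all_partitions = sorted(
        (t, p) for t, ps in partitions_by_topic.items() for p in ps
    )
    all_threads = sorted(
        {th for ths in threads_by_topic.values() for th in ths}
    )
    assignment = defaultdict(list)
    cursor = 0  # position of the next thread to try
    for topic, p in all_partitions:
        subscribed = set(threads_by_topic.get(topic, []))
        # Try each thread at most once, starting from the cursor.
        for step in range(len(all_threads)):
            idx = (cursor + step) % len(all_threads)
            thread = all_threads[idx]
            if thread in subscribed:
                assignment[thread].append((topic, p))
                cursor = idx + 1
                break
    return dict(assignment)


if __name__ == "__main__":
    parts = {"T1": [1, 2], "T2": [1, 2]}
    threads = ["C1-Thr1", "C1-Thr2", "C2-Thr1", "C2-Thr2"]
    same_topics = {"T1": threads, "T2": threads}
    print(range_assign(parts, same_topics))
    print(round_robin_assign(parts, same_topics))
```

With two topics of two partitions each and two consumers of two threads
each, all selecting both topics, `range_assign` leaves consumer 2 idle and
`round_robin_assign` gives each thread one partition, matching the listings
in the thread. When the threads select different topics, the skipping
variant still assigns every partition to a thread that selects it, though
the resulting load is no longer guaranteed to be even.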
