I found this thread after posting an alternative idea, once we started hitting this issue ourselves with a job that has a lot of state stores and topic partitions. My suggestion was for consumer groups to have a configurable minimum member count before consumption begins, but that has its own trade-offs and benefits (maybe a different KIP).
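A minimal sketch of the minimum-member-count idea, assuming a hypothetical coordinator-side gate. `MinMemberGate`, `min_members`, `join`, `leave`, and `ready` are illustrative names only, not Kafka APIs or configs:

```python
class MinMemberGate:
    """Hypothetical gate: hold consumption until enough members have joined."""

    def __init__(self, min_members):
        if min_members < 1:
            raise ValueError("min_members must be at least 1")
        self.min_members = min_members
        self.members = set()

    def join(self, member_id):
        """Register a member and report whether consumption may begin."""
        self.members.add(member_id)
        return self.ready()

    def leave(self, member_id):
        """Remove a member; unknown ids are ignored."""
        self.members.discard(member_id)

    def ready(self):
        """True once the current member count reaches the configured minimum."""
        return len(self.members) >= self.min_members


gate = MinMemberGate(min_members=3)
results = [gate.join(m) for m in ("a", "b", "c")]
print(results)  # [False, False, True]
```

One trade-off this makes visible: a group that never reaches the minimum never starts consuming, so a real design would probably need a timeout as well.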
One suggestion I had: maybe there is some relatively foolproof heuristic that would let Kafka Streams emit an INFO/WARN log message informing the user of the configuration if it detects a rapid rebalance on startup due to new nodes joining? For example, if Streams detects a rebalance before processors are initialized, the rebalance only adds new nodes, and the configuration has not been overridden, write to the log?

On Thu, Jun 8, 2017 at 2:56 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Just recapping on client-side v.s. broker-side config: we did discuss about
> adding this as a client-side config and bump up join-group request (I think
> both Ismael and Ewen questioned about it) to include this configured value
> to the broker. I cannot remember if there is any strong motivations against
> going to the client-side config, except that we felt a default non-zero
> value will benefit most users assuming they start with more than one member
> in their group but only advanced users would really realize this config
> existing and tune it themselves.
>
> I agree that we could re-consider it for the next release if we observe
> that it is actually affecting more users than benefiting them.
>
> Guozhang
>
> On Wed, Jun 7, 2017 at 2:26 AM, Damian Guy <damian....@gmail.com> wrote:
>
> > Hi Jun/Ismael,
> >
> > Sounds good to me.
> >
> > Thanks,
> > Damian
> >
> > On Tue, 6 Jun 2017 at 23:08 Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > Hi Jun,
> > >
> > > The console consumer issue also came up in a conversation I was having
> > > recently. Seems like the config/server.properties change is a reasonable
> > > compromise given that we have other defaults that are for development.
> > >
> > > Ismael
> > >
> > > On Tue, Jun 6, 2017 at 10:59 PM, Jun Rao <j...@confluent.io> wrote:
> > >
> > > > Hi, Everyone,
> > > >
> > > > Sorry for being late on this thread. I just came across this thread. I have
> > > > a couple of concerns on this.
> > > > (1) It seems the amount of delay will be
> > > > application specific. So, it seems that it's better for the delay to be a
> > > > client side config instead of a server side one? (2) When running console
> > > > consumer in quickstart, a minimum of 3 sec delay seems to be a bad
> > > > experience for our users.
> > > >
> > > > Since we are getting late into the release cycle, it may be a bit too late
> > > > to make big changes in the 0.11 release. Perhaps we should at least
> > > > consider overriding the delay in config/server.properties to 0 to improve
> > > > the quickstart experience?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Tue, Apr 11, 2017 at 12:19 AM, Damian Guy <damian....@gmail.com> wrote:
> > > >
> > > > > Hi Onur,
> > > > >
> > > > > It was in my previous email. But here it is again.
> > > > >
> > > > > ============================================================
> > > > >
> > > > > 1. Better rebalance timing. We will try to rebalance only when all the
> > > > > consumers in a group have joined. The challenge would be someone has to
> > > > > define what does ALL consumers mean, it could either be a time or number of
> > > > > consumers, etc.
> > > > >
> > > > > 2. Avoid frequent rebalance. For example, if there are 100 consumers in a
> > > > > group, today, in the worst case, we may end up with 100 rebalances even if
> > > > > all the consumers joined the group in a reasonably small amount of time.
> > > > > Frequent rebalance is also a bad thing for brokers.
> > > > >
> > > > > Having a client side configuration may solve problem 1 better because each
> > > > > consumer group can potentially configure their own timing. However, it does
> > > > > not really prevent frequent rebalance in general because some of the
> > > > > consumers can be misconfigured.
> > > > > (This may have something to do with KIP-124
> > > > > as well. But if quota is applied on the JoinGroup/SyncGroup request it may
> > > > > cause some unwanted cascading effects.)
> > > > >
> > > > > Having a broker side configuration may result in less flexibility for each
> > > > > consumer group, but it can prevent frequent rebalance better. I think with
> > > > > some reasonable design, the rebalance timing issue can be resolved on the
> > > > > broker side as well. Matthias had a good point on extending the delay when
> > > > > a new consumer joins a group (we actually did something similar to batch
> > > > > ISR change propagation). For example, let's say on the broker side, we will
> > > > > always delay 2 seconds each time we see a new consumer joining a consumer
> > > > > group. This would probably work for most of the consumer groups and will
> > > > > also limit the rebalance frequency to protect the brokers.
> > > > >
> > > > > I am not sure about the streams use case here, but if something like 2
> > > > > seconds of delay is acceptable for streams, I would prefer adding the
> > > > > configuration to the broker so that we can address both problems.
> > > > >
> > > > > On Thu, 6 Apr 2017 at 17:11 Onur Karaman <onurkaraman.apa...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Damian.
> > > > > >
> > > > > > Can you copy the point Becket made earlier that you say isn't addressed?
> > > > > >
> > > > > > On Thu, Apr 6, 2017 at 2:51 AM, Damian Guy <damian....@gmail.com> wrote:
> > > > > >
> > > > > > > Thanks all, the Vote is now closed and the KIP has been accepted with 9 +1s
> > > > > > >
> > > > > > > 3 binding:
> > > > > > > Guozhang,
> > > > > > > Jason,
> > > > > > > Ismael
> > > > > > >
> > > > > > > 6 non-binding:
> > > > > > > Bill,
> > > > > > > Eno,
> > > > > > > Mathieu,
> > > > > > > Matthias,
> > > > > > > Dong,
> > > > > > > Mickael
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Damian
> > > > > > >
> > > > > > > On Thu, 6 Apr 2017 at 09:26 Ismael Juma <ism...@juma.me.uk> wrote:
> > > > > > >
> > > > > > > > Thanks for the KIP, +1 (binding).
> > > > > > > >
> > > > > > > > Ismael
> > > > > > > >
> > > > > > > > On Thu, Mar 30, 2017 at 8:55 PM, Jason Gustafson <ja...@confluent.io>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1 Thanks for the KIP!
> > > > > > > > >
> > > > > > > > > On Thu, Mar 30, 2017 at 12:51 PM, Guozhang Wang <wangg...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > Sorry about the previous email, Gmail seems be collapsing them into a
> > > > > > > > > > single thread on my inbox.
> > > > > > > > > >
> > > > > > > > > > Guozhang
> > > > > > > > > >
> > > > > > > > > > On Thu, Mar 30, 2017 at 11:34 AM, Guozhang Wang <wangg...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Damian, could you create a new thread for the voting process?
> > > > > > > > > > >
> > > > > > > > > > > Thanks!
> > > > > > > > > > >
> > > > > > > > > > > Guozhang
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 30, 2017 at 10:33 AM, Bill Bejeck <bbej...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > >> +1(non-binding)
> > > > > > > > > > >>
> > > > > > > > > > >> On Thu, Mar 30, 2017 at 1:30 PM, Eno Thereska <eno.there...@gmail.com>
> > > > > > > > > > >> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> > +1 (non binding)
> > > > > > > > > > >> >
> > > > > > > > > > >> > Thanks
> > > > > > > > > > >> > Eno
> > > > > > > > > > >> > > On 30 Mar 2017, at 18:01, Matthias J. Sax <matth...@confluent.io>
> > > > > > > > > > >> > > wrote:
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > +1
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > On 3/30/17 3:46 AM, Damian Guy wrote:
> > > > > > > > > > >> > >> Hi All,
> > > > > > > > > > >> > >>
> > > > > > > > > > >> > >> I'd like to start the voting thread on KIP-134:
> > > > > > > > > > >> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance
> > > > > > > > > > >> > >>
> > > > > > > > > > >> > >> Thanks,
> > > > > > > > > > >> > >> Damian
> > > > > > > > > > >> > >>
> > > > > > > > > > >> > >
> > > > > > > > > > >> >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > -- Guozhang
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > -- Guozhang
>
> --
> -- Guozhang