It does seem like we are in a situation similar to the one described in the KIP (
https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance).
For some historical reason we had set our session.timeout.ms to a high
value (5 minutes), which corresponds to the amount of time the group would
wait. Lowering it helps new consumers join faster, but now I'm seeing group
fluctuations, so I'll have to keep digging into why that's happening. Also
worth noting: the broker's rebalance timeout seems to switch to
max.poll.interval.ms in 0.10.1.
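
For anyone following along, here is a rough sketch of the settings involved. It only builds a plain java.util.Properties object, so it runs without the Kafka client on the classpath. The class name, the helper method, and the concrete values (30 s session timeout, 10 s heartbeat) are illustrative assumptions on my part, not what we actually run. The version switch just encodes my reading above that the rebalance timeout follows session.timeout.ms before 0.10.1 and max.poll.interval.ms from 0.10.1 on.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: collects the timeout settings
// discussed above so the relationship between them is easy to see.
final class ConsumerTimeoutSketch {

    // Builds consumer properties with illustrative (assumed) values.
    static Properties buildConsumerProps() {
        Properties props = new Properties();
        // Lowered from the 5 minutes we had before; 30 s is an assumed example.
        props.setProperty("session.timeout.ms", "30000");
        // Commonly kept at no more than a third of the session timeout.
        props.setProperty("heartbeat.interval.ms", "10000");
        // Only understood by 0.10.1+ clients; 5 minutes here.
        props.setProperty("max.poll.interval.ms", "300000");
        return props;
    }

    // Returns the value that appears to bound how long a rebalance waits
    // for members to rejoin, depending on the client/broker version.
    static int effectiveRebalanceTimeoutMs(Properties props, boolean atLeast0101) {
        String key = atLeast0101 ? "max.poll.interval.ms" : "session.timeout.ms";
        return Integer.parseInt(props.getProperty(key));
    }

    public static void main(String[] args) {
        Properties props = buildConsumerProps();
        System.out.println("pre-0.10.1 rebalance timeout (ms): "
                + effectiveRebalanceTimeoutMs(props, false));
        System.out.println("0.10.1+ rebalance timeout (ms): "
                + effectiveRebalanceTimeoutMs(props, true));
    }
}
```

The same Properties object could be handed to a KafkaConsumer constructor once the usual bootstrap.servers, group.id, and deserializer settings are added. The point of the sketch is that after the upgrade a long max.poll.interval.ms may bring back the slow-join behavior even with a short session timeout.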

On Wed, Jun 14, 2017 at 10:53 AM Bryan Baugher <bjb...@gmail.com> wrote:

> While I do have some logs, it's not trivial to share them since they are
> spread across 16 JVMs and a few different hosts.
>
> On Wed, Jun 14, 2017 at 10:34 AM Eno Thereska <eno.there...@gmail.com>
> wrote:
>
>> The delay in that KIP is just 3 seconds, not minutes though, right? Would
>> you have any logs to share?
>>
>> Thanks
>> Eno
>> > On 14 Jun 2017, at 16:14, Bryan Baugher <bjb...@gmail.com> wrote:
>> >
>> > Our consumer group isn't doing anything stateful and we've seen this
>> > behavior for existing groups as well. It seems like timing could be an
>> > issue, thanks for the information.
>> >
>> > On Tue, Jun 13, 2017 at 7:39 PM James Cheng <wushuja...@gmail.com>
>> wrote:
>> >
>> >> Bryan,
>> >>
>> >> This sounds related to
>> >>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance
>> >> and https://issues.apache.org/jira/browse/KAFKA-4925.
>> >>
>> >> -James
>> >>
>> >>> On Jun 13, 2017, at 7:02 AM, Bryan Baugher <bjb...@gmail.com> wrote:
>> >>>
>> >>> The topics already exist prior to starting any of the consumers
>> >>>
>> >>> On Mon, Jun 12, 2017 at 9:35 PM J Pai <jai.forums2...@gmail.com>
>> wrote:
>> >>>
>> >>>> When are the topics on which these consumer groups consume, created?
>> >>>>
>> >>>> -Jaikiran
>> >>>> On 13-Jun-2017, at 3:18 AM, Bryan Baugher <bjb...@gmail.com> wrote:
>> >>>>
>> >>>> Hi everyone,
>> >>>>
>> >>>> We are currently experiencing slow startup times for our consumer
>> groups
>> >>>> (16-32 processes for a hundred or more partitions) in the range of
>> >> minutes
>> >>>> (3-15 minutes), where little to no messages are consumed before
>> suddenly
>> >>>> everything just starts working at full speed.
>> >>>>
>> >>>> I'm currently using Kafka 0.9.0.1 but we are in the middle of
>> upgrading
>> >> to
>> >>>> Kafka 0.10.2.1. We are also using the newer Kafka consumer API and group
>> >>>> management on a simple Apache Storm topology. We don't make use of
>> >> Storm's
>> >>>> kafka spout but instead wrote a simple one ourselves.
>> >>>>
>> >>>> Using the kafka AdminClient I was able to poll for consumer group
>> >> summary
>> >>>> information. What I've found is that the group seems to sit
>> >>>> in PreparingRebalance state for minutes before finally becoming Stable,
>> >>>> at which point everything starts processing quickly. I've also enabled debug
>> >>>> logging around the consumer's coordinator classes but didn't see
>> >> anything
>> >>>> to indicate the issue.
>> >>>>
>> >>>> I'm hoping that just upgrading to 0.10 or tweaking how we use our consumer
>> >>>> in Apache Storm fixes the problem, but are there any pointers on things I
>> >>>> should look at or try?
>> >>>>
>> >>>> Bryan
>> >>>>
>> >>>>
>> >>
>> >>
>>
>>
