For “first partition”, I was speaking specifically of your example - Burrow doesn’t care about partition 0 vs. any other partition. Looking at that output from the groups tool, it looks like there are a lot of partitions with no committed offsets. There’s even one partition with a committed offset past the log end offset, which is concerning. My guess here is that after you started Burrow, there were no offset commits until after that message was written to partition 1. After that, there was an offset commit, which allowed Burrow to discover the consumer group.
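For anyone scripting against the Burrow HTTP API Tom is querying below, a group's evaluated status can be polled and filtered. This is a minimal sketch, not Burrow's documented contract: the response shape (the "status"/"partitions" fields and the "OK"/"STALLED" values) is assumed from the v2 endpoint layout, and the sample values are hypothetical, so check the project wiki for your version before relying on it:

```python
import json
from urllib.request import urlopen

BURROW = "http://localhost:8100"  # port from the thread; adjust as needed

def lagging_partitions(status_body):
    """Return partitions whose evaluated status is not OK, given a
    /v2/kafka/<cluster>/consumer/<group>/status response body (assumed shape)."""
    partitions = status_body.get("status", {}).get("partitions", [])
    return [p for p in partitions if p.get("status") != "OK"]

def fetch_group_status(cluster, group):
    # Live call -- needs a running Burrow instance at BURROW.
    url = "%s/v2/kafka/%s/consumer/%s/status" % (BURROW, cluster, group)
    with urlopen(url) as resp:
        return json.load(resp)

# Hypothetical, abridged response body for illustration only:
sample = {
    "error": False,
    "status": {
        "cluster": "betwave",
        "group": "voidbridge-oneworks-dummy",
        "status": "WARN",
        "partitions": [
            {"topic": "integration-oneworks-dummy", "partition": 1, "status": "OK"},
            {"topic": "integration-oneworks-dummy", "partition": 0, "status": "STALLED"},
        ],
    },
}
print(lagging_partitions(sample))
```

Polling this per group from your alerting system avoids parsing CLI output entirely, which is the point of Burrow's HTTP interface.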
One of the things I want to do is to have Burrow bootstrap the __consumer_offsets topic from the oldest offsets, which should avoid some confusion like this. However, there are a couple of things with higher priority for me personally first.

-Todd

On Fri, Jul 8, 2016 at 9:22 AM, Tom Dearman <tom.dear...@gmail.com> wrote:
> Sorry, I should say only partition 1 had something at first, then zero:
>
> Toms-iMac:betwave-server tomdearman$ /Users/tomdearman/software/kafka_2.11-0.10.0.0/bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server localhost:9092 --describe --group voidbridge-oneworks-dummy
> GROUP                      TOPIC                       PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG      OWNER
> voidbridge-oneworks-dummy  integration-oneworks-dummy  2          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-3_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  7          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-3_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  12         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-3_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  17         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-3_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  4          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-5_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  9          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-5_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  14         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-5_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  19         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-5_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  1          3               3               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-2_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  6          0               0               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-2_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  11         0               0               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-2_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  16         0               0               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-2_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  3          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-4_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  8          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-4_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  13         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-4_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  18         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-4_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  0          10              0               -10      integration-oneworks-dummy-voidbridge-oneworks-dummy-1_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  5          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-1_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  10         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-1_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  15         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-1_/10.100.0.113
>
> Toms-iMac:betwave-server tomdearman$ /Users/tomdearman/software/kafka_2.11-0.10.0.0/bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server localhost:9092 --describe --group voidbridge-oneworks-dummy
> GROUP                      TOPIC                       PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG      OWNER
> voidbridge-oneworks-dummy  integration-oneworks-dummy  2          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-3_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  7          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-3_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  12         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-3_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  17         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-3_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  4          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-5_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  9          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-5_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  14         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-5_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  19         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-5_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  1          3               3               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-2_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  6          0               0               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-2_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  11         0               0               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-2_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  16         0               0               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-2_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  3          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-4_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  8          unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-4_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  13         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-4_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  18         unknown         0               unknown  integration-oneworks-dummy-voidbridge-oneworks-dummy-4_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  0          1               1               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-1_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  5          0               0               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-1_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  10         0               0               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-1_/10.100.0.113
> voidbridge-oneworks-dummy  integration-oneworks-dummy  15         0               0               0        integration-oneworks-dummy-voidbridge-oneworks-dummy-1_/10.100.0.113
>
>
> > On 8 Jul 2016, at 17:20, Tom Dearman <tom.dear...@gmail.com> wrote:
> >
> > When you say ‘for the first partition’, do you literally mean partition
> > zero, or do you mean any partition? It is true that when I had only 1 user
> > there were only messages on partition 15, but the second user happened to
> > go to partition zero. Is it the case that partition zero must have a
> > consumer commit?
> >
> >> On 8 Jul 2016, at 17:16, Todd Palino <tpal...@gmail.com> wrote:
> >>
> >> If you open up an issue on the project, I'd be happy to dig into this in
> >> more detail if needed. Excluding the ZK offset checking, Burrow doesn't
> >> enumerate consumer groups - it learns about them from offset commits. It
> >> sounds like maybe your consumer had not committed offsets for the first
> >> partition (at least not after Burrow was started).
> >>
> >> -Todd
> >>
> >> On Friday, July 8, 2016, Tom Dearman <tom.dear...@gmail.com> wrote:
> >>
> >>> Todd,
> >>>
> >>> Thanks for that, I am taking a look.
> >>>
> >>> Is there a bug whereby, if you only have a couple of messages on a
> >>> topic, both with the same key, Burrow doesn’t return correct info? I was
> >>> finding that http://localhost:8100/v2/kafka/betwave/consumer was
> >>> returning a message with empty consumers until I put on another message
> >>> with a different key, i.e. a minimum of 2 partitions with something in
> >>> them. I know this is not very like production, but on my local machine I
> >>> was only testing with one user, so only one partition got filled.
> >>>
> >>> Tom
> >>>
> >>>> On 6 Jul 2016, at 18:08, Todd Palino <tpal...@gmail.com> wrote:
> >>>>
> >>>> Yeah, I've written dissertations at this point on why MaxLag is
> >>>> flawed. We also used to use the offset checker tool, and later
> >>>> something similar that was a little easier to slot into our monitoring
> >>>> systems. Problems with all of these is why I wrote Burrow
> >>>> (https://github.com/linkedin/Burrow).
> >>>>
> >>>> For more details, you can also check out my blog post on the release:
> >>>> https://engineering.linkedin.com/apache-kafka/burrow-kafka-consumer-monitoring-reinvented
> >>>>
> >>>> -Todd
> >>>>
> >>>> On Wednesday, July 6, 2016, Tom Dearman <tom.dear...@gmail.com> wrote:
> >>>>
> >>>>> I recently had a problem on my production which I believe was a
> >>>>> manifestation of the issue KAFKA-2978 (topic partition is sometimes
> >>>>> not consumed after rebalancing of consumer group); this is fixed in
> >>>>> 0.9.0.1 and we will upgrade our client soon. However, it made me
> >>>>> realise that I didn’t have any monitoring set up on this. The only
> >>>>> thing I can find as a metric is
> >>>>> kafka.consumer:type=ConsumerFetcherManager,name=MaxLag,clientId=([-.\w]+),
> >>>>> which, if I understand correctly, is the max lag of any partition
> >>>>> that that particular consumer is consuming.
> >>>>> 1. If I had been monitoring this, and if my consumer was suffering
> >>>>> from the issue in KAFKA-2978, would I actually have been alerted?
> >>>>> I.e. since the consumer would think it is consuming correctly, would
> >>>>> it not have updated the metric?
> >>>>> 2. There is another way to see offset lag, using the command
> >>>>> /usr/bin/kafka-consumer-groups --new-consumer --bootstrap-server
> >>>>> 10.10.1.61:9092 --describe --group consumer_group_name and parsing
> >>>>> the response. Is it safe or advisable to do this? I like the fact
> >>>>> that it tells me each partition's lag, although it is also not
> >>>>> available if no consumer from the group is currently consuming.
> >>>>> 3. Is there a better way of doing this?
> >>>>
> >>>> --
> >>>> *Todd Palino*
> >>>> Staff Site Reliability Engineer
> >>>> Data Infrastructure Streaming
> >>>>
> >>>> linkedin.com/in/toddpalino
> >>
> >> --
> >> *Todd Palino*
> >> Staff Site Reliability Engineer
> >> Data Infrastructure Streaming
> >>
> >> linkedin.com/in/toddpalino

--
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming

linkedin.com/in/toddpalino