We do have a JMX metric that reports the lag per partition, so you could
probably get the lag that way. Then you just need to slow down the iteration
on the fast partitions.
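
A minimal sketch of that idea in Java, not taken from Kafka itself. It
assumes you already have a map of partition number to current lag (from the
JMX metric or a ConsumerOffsetChecker-style lookup) and that each consumer
thread knows which partition it is reading, for example from the metadata of
the message it just consumed (MessageAndMetadata.partition(), if your version
exposes it). The class name LagThrottle, the method pauseMillis and the
parameters maxSpread and pauseMs are hypothetical names used for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Kafka: decides whether the thread reading
// a given partition should pause so that slower partitions can catch up.
final class LagThrottle {

    // Returns pauseMs if this partition's lag is more than maxSpread messages
    // below the largest lag in the map, otherwise 0 (no throttling).
    public static long pauseMillis(Map<Integer, Long> lagByPartition,
                                   int partition, long maxSpread, long pauseMs) {
        Long lag = lagByPartition.get(partition);
        if (lag == null) {
            return 0L; // unknown partition: do not throttle
        }
        long maxLag = Collections.max(lagByPartition.values());
        return (maxLag - lag > maxSpread) ? pauseMs : 0L;
    }

    public static void main(String[] args) {
        // Made-up lag values, for illustration only.
        Map<Integer, Long> lag = new HashMap<>();
        lag.put(0, 12000L);
        lag.put(1, 150L);
        lag.put(2, 11800L);
        for (Map.Entry<Integer, Long> e : lag.entrySet()) {
            long pause = pauseMillis(lag, e.getKey(), 1000L, 200L);
            System.out.println("partition " + e.getKey() + " pause ms: " + pause);
        }
    }
}
```

Each consumer thread would call pauseMillis() with the partition of the
message it just read and, if the result is greater than zero, call
Thread.sleep() with that value before taking the next message from its
iterator.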

Thanks,

Jun


On Mon, Jun 9, 2014 at 4:07 AM, Bogdan Dimitriu (bdimitri) <
bdimi...@cisco.com> wrote:

> Certainly.
> I know this may not sound like a great idea but I am running out of
> options here: I'm basically trying to implement a consumer throttle. My
> application consumes from a fairly high number of partitions from a number
> of consumer servers. The data is put in the partitions by a producer in a
> round robin fashion so the number of messages each partition is given is
> even. The messages have a time component assigned to them.
> Now, for a good majority of time the consumers will be faster than the
> producers, so the lags I get with the ConsumerOffsetChecker are mostly 0
> (or 1) and this works well with the time component because once they are
> consumed there is a logical grouping of messages from all partitions based
> on the time component (coarsely).
> The point where I'm starting to get into trouble is when the consumers are
> all stopped for a while and messages start to accumulate in the partitions
> without being consumed (hence the lag increases). Once I resume the
> consumers (with one thread per partition), messages start to get
> consumed very fast, but because some messages take longer to process than
> others, over time the lags start to get very uneven between partitions and
> this starts to interfere with the grouping by the time component.
> So the only way I thought I could prevent this from happening was by
> throttling the "fast" consumers, which have a smaller lag. This will only
> happen rarely, so I thought I could live with the approach. To do that, I
> obviously need to get the lag information periodically. I do that by a
> mechanism similar to what ConsumerOffsetChecker does. But now I need to
> know from which thread to call sleep() and there seems to be no decent way
> to find that out.
> Any better ideas would be highly appreciated.
>
> Thanks,
> Bogdan
>
> On 08/06/2014 06:25, "Jun Rao" <jun...@gmail.com> wrote:
>
> >Could you elaborate on the use case of the stream ID?
> >
> >Thanks,
> >
> >Jun
> >
> >
> >On Fri, Jun 6, 2014 at 2:13 AM, Bogdan Dimitriu (bdimitri) <
> >bdimi...@cisco.com> wrote:
> >
> >> Hello folks,
> >>
> >> I'm using Kafka 0.8.0 with the high level consumer and I have a situation
> >> where I need to obtain the ID for each of the KafkaStreams that I create.
> >> The KafkaStream class has a method called "clientId()" that I expected
> >> would give me just that, but unfortunately it returns the name of the
> >> consumer group.
> >> So to make it clear, what I want to obtain is the string that looks like
> >> this: myconsumergroup_myhost-1402045464004-2dc0cbf2-0.
> >> Is there any way I could get that value for each of the streams? I've
> >> looked around the source code but I can't see any way to do this.
> >>
> >> Many thanks,
> >> Bogdan
> >>
> >>
>
>
