Re: consumer lag metric
Thanks Todd. That will work.

On Tue, Feb 17, 2015 at 10:31 PM, Todd Palino wrote:

> In order to do that, you'll need to run the offset checker and parse its
> output, and then emit it to your metrics system of choice. This is
> essentially what I do - I have a monitoring application which runs every
> minute and pulls the offsets for a select set of topics and consumers, and
> then packages up the metrics and sends them to our internal system.
>
> It's not ideal. We're working on a script to calculate lag efficiently for
> all consumers who commit offsets to Kafka, rather than a select set.
>
> -Todd

--
Regards,
Tao
Re: consumer lag metric
In order to do that, you'll need to run the offset checker and parse its
output, and then emit it to your metrics system of choice. This is
essentially what I do - I have a monitoring application which runs every
minute and pulls the offsets for a select set of topics and consumers, and
then packages up the metrics and sends them to our internal system.

It's not ideal. We're working on a script to calculate lag efficiently for
all consumers who commit offsets to Kafka, rather than a select set.

-Todd

On Mon, Feb 16, 2015 at 12:27 AM, tao xiao wrote:

> Thank you Todd for your detailed explanation. Currently I export all
> metrics to Graphite using the reporter configuration. Is there a way I can
> do a similar thing with the offset checker?
>
> --
> Regards,
> Tao
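For anyone wanting to try the "run it, parse the output, emit it" approach, here is a minimal Python sketch. It is not Todd's monitoring application. It assumes the offset checker prints whitespace-separated columns in the order Group, Topic, Pid, Offset, logSize, Lag, Owner, and it uses Graphite's plaintext protocol (one "path value timestamp" line per metric). The sample rows, the metric prefix, and all function names are made up for illustration.

```python
import socket
import time

# Hypothetical sample of offset checker output. The column layout
# (Group, Topic, Pid, Offset, logSize, Lag, Owner) is an assumption.
SAMPLE_OUTPUT = """\
Group     Topic    Pid  Offset  logSize  Lag  Owner
my-group  events   0    9200    10000    800  none
my-group  events   1    4100    5000     900  none
"""


def parse_lag(output):
    """Return {(group, topic, partition): lag} from whitespace-separated rows."""
    lags = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a data row.
        if len(fields) < 6 or not fields[2].isdigit():
            continue
        group, topic, pid, _offset, _log_size, lag = fields[:6]
        lags[(group, topic, int(pid))] = int(lag)
    return lags


def to_graphite_lines(lags, timestamp, prefix="kafka.consumer_lag"):
    """Format lags as Graphite plaintext lines: '<path> <value> <timestamp>'."""
    return [
        f"{prefix}.{group}.{topic}.{partition} {lag} {timestamp}"
        for (group, topic, partition), lag in sorted(lags.items())
    ]


def send_to_graphite(lines, host, port=2003):
    """Send lines to a Graphite plaintext listener (2003 is its usual port)."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


# Build the metric lines from the sample; nothing is sent over the network here.
lines = to_graphite_lines(parse_lag(SAMPLE_OUTPUT), timestamp=int(time.time()))
```

In a real setup you would capture the tool's output (for example with subprocess) in place of SAMPLE_OUTPUT, call send_to_graphite on a schedule such as once a minute, and sanitize topic or group names that contain dots, since Graphite treats dots as path separators.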
Re: consumer lag metric
Thank you Todd for your detailed explanation. Currently I export all
metrics to Graphite using the reporter configuration. Is there a way I can
do a similar thing with the offset checker?

On Mon, Feb 16, 2015 at 4:21 PM, Todd Palino wrote:

> The reason for this is the way each lag is calculated. MaxLag (and
> FetcherLagMetrics) is calculated by the consumer itself, using the
> difference between the offset it knows it is at and the offset that the
> broker has as the end of the partition. The offset checker, however, uses
> the last offset that the consumer committed. Depending on your
> configuration, this is somewhere behind where the consumer actually is.
> For example, if your commit interval is set to 10 minutes, the number used
> by the offset checker can be up to 10 minutes behind where the consumer
> actually is.
>
> So while MaxLag may be more up to date at any given time, it's actually
> less accurate. Because MaxLag relies on the consumer to report it, if the
> consumer breaks, you will not see an accurate lag number. This is why, when
> we are checking consumer lag, we use an external process that uses the
> committed consumer offsets. This allows us to catch a broken consumer, as
> well as an active consumer that is just falling behind.
>
> -Todd

--
Regards,
Tao
Re: consumer lag metric
The reason for this is the way each lag is calculated. MaxLag (and
FetcherLagMetrics) is calculated by the consumer itself, using the
difference between the offset it knows it is at and the offset that the
broker has as the end of the partition. The offset checker, however, uses
the last offset that the consumer committed. Depending on your
configuration, this is somewhere behind where the consumer actually is.
For example, if your commit interval is set to 10 minutes, the number used
by the offset checker can be up to 10 minutes behind where the consumer
actually is.

So while MaxLag may be more up to date at any given time, it's actually
less accurate. Because MaxLag relies on the consumer to report it, if the
consumer breaks, you will not see an accurate lag number. This is why, when
we are checking consumer lag, we use an external process that uses the
committed consumer offsets. This allows us to catch a broken consumer, as
well as an active consumer that is just falling behind.

-Todd

On Fri, Feb 13, 2015 at 9:34 PM, tao xiao wrote:

> Thanks Joel. But I discovered that both MaxLag and FetcherLagMetrics are
> always much smaller than the lag shown in the offset checker. Any reason?
>
> --
> Regards,
> Tao
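To make the difference between the two numbers concrete, here is a small Python sketch. It is not Kafka code. The PartitionState class, its field names, and the offsets are hypothetical values chosen only to show why the committed-offset lag is the larger of the two.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    """Hypothetical snapshot of one partition, for illustration only."""
    log_end_offset: int    # end of the partition as the broker sees it
    fetch_position: int    # offset the consumer knows it is at
    committed_offset: int  # last offset the consumer committed


def consumer_reported_lag(p):
    """Lag as the consumer computes it (the MaxLag / FetcherLagMetrics view)."""
    return max(p.log_end_offset - p.fetch_position, 0)


def committed_offset_lag(p):
    """Lag as an external checker computes it from committed offsets."""
    return max(p.log_end_offset - p.committed_offset, 0)


def max_lag(partitions):
    """Maximum of the consumer-reported lags across all partitions."""
    return max((consumer_reported_lag(p) for p in partitions), default=0)


partitions = [
    PartitionState(log_end_offset=10_000, fetch_position=9_990, committed_offset=9_200),
    PartitionState(log_end_offset=5_000, fetch_position=4_800, committed_offset=4_100),
]

print(max_lag(partitions))                            # 200
print([committed_offset_lag(p) for p in partitions])  # [800, 900]
```

If the consumer dies, fetch_position stops being reported at all, while an external process can keep recomputing committed_offset_lag as log_end_offset grows. That is the broken-consumer case described above.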
Re: consumer lag metric
Thanks Joel. But I discovered that both MaxLag and FetcherLagMetrics are
always much smaller than the lag shown in the offset checker. Any reason?

On Sat, Feb 14, 2015 at 7:22 AM, Joel Koshy wrote:

> There are FetcherLagMetrics that you can take a look at. However, it
> is probably easiest to just monitor MaxLag as that reports the maximum
> of all the lag metrics.

--
Regards,
Tao
Re: consumer lag metric
There are FetcherLagMetrics that you can take a look at. However, it
is probably easiest to just monitor MaxLag as that reports the maximum
of all the lag metrics.

On Fri, Feb 13, 2015 at 05:03:28PM +0800, tao xiao wrote:

> Hi team,
>
> Is there a metric that shows the consumer lag of a particular consumer
> group, similar to what the offset checker provides?
>
> --
> Regards,
> Tao
consumer lag metric
Hi team,

Is there a metric that shows the consumer lag of a particular consumer
group, similar to what the offset checker provides?

--
Regards,
Tao