If we are committed to migrating the broker side metrics to KM for the next release, we will need to (1) have a story on supporting common reporters (as listed in KAFKA-1930), and (2) see if the current histogram support is good enough for measuring things like request time.
Thanks,

Jun

On Mon, Mar 30, 2015 at 3:03 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:

If we do plan to use the network code in the client, I think that is a good reason in favor of migration. It will be unnecessary to have metrics from multiple libraries coexist since our users will have to start monitoring these new metrics anyway.

I also agree with Jay that in multi-tenant clusters people care about detailed statistics for their own application over global numbers.

Based on the arguments so far, I'm +1 for migrating to KM.

Thanks,
Aditya

________________________________________
From: Jun Rao [j...@confluent.io]
Sent: Sunday, March 29, 2015 9:44 AM
To: dev@kafka.apache.org
Subject: Re: Metrics package discussion

There is another thing to consider. We plan to reuse the client components on the server side over time. For example, as part of the security work, we are looking into replacing the server-side network code with the client network code (KAFKA-1928). However, the client network code already has metrics based on KM.

Thanks,

Jun

On Sat, Mar 28, 2015 at 1:34 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

I think Joel's summary is good.

I'll add a few more points:

As discussed, memory matters a lot if we want to be able to give percentiles at the client or topic level, in which case we will have thousands of them. If we just do histograms at the global level then it is not a concern. The argument for doing histograms at the client and topic level is that averages are often very misleading, especially for latency information or other asymmetric distributions. Most people who care about this kind of thing would say the same. If you are a user of a multi-tenant cluster then you probably care a lot more about stats for your application or your topic than about the global numbers, so it could be nice to have histograms for these. I don't feel super strongly about this.

The ExponentiallyDecayingSample is internally a ConcurrentSkipListMap<Double, Long>. This seems to have an overhead of about 64 bytes per entry, so a 1000-element sample is 64KB. For global metrics this is fine, but for granular metrics it is not workable.

Two other issues I'm not sure about:

1. Is there a way to get metric descriptions into the coda hale JMX output? One of the nicest practical things about the new client metrics is that if you look at them in jconsole each metric has an associated description that explains what it means. I think this is a nice usability thing--it is really hard to know what to make of the current metrics without this kind of documentation, and keeping separate docs up-to-date is really hard; even if you do it, most people won't find them.

2. I'm not clear if the sample decay in the histogram is actually the same as for the other stats. It seems like it isn't, but this would make interpretation quite difficult. In other words, if I have N metrics including some Histograms, some Meters, etc., are all these measurements taken over the same time window? I actually think they are not; it looks like different sampling methodologies are used across the stat types. So this means if you have a dashboard that plots these things side by side, the measurement at a given point in time is not actually comparable across multiple stats. Am I confused about this?
-Jay

On Fri, Mar 27, 2015 at 6:27 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

For the samples: it will be at least double that estimate I think, since the long array contains (eight byte) references to the actual longs, each of which also has some object overhead.

Re: testing: actually, it looks like YM metrics does allow you to drop in your own clock (a sketch follows after this message):

https://github.com/dropwizard/metrics/blob/master/metrics-core/src/main/java/com/codahale/metrics/Clock.java
https://github.com/dropwizard/metrics/blob/master/metrics-core/src/main/java/com/codahale/metrics/Meter.java#L36

Not sure if it was mentioned in this (or some recent) thread, but a major motivation in the kafka-common metrics (KM) was absorbing API changes and even mbean naming conventions. For example, in the early stages of 0.8 we picked up YM metrics 3.x but collided with client apps at LinkedIn which were still on 2.x. We ended up changing our code to use 2.x in the end. Having our own metrics package makes us less vulnerable to these kinds of changes. The multiple-version collision problem is obviously less of an issue with the broker, but we are still exposed to possible metric changes in YM metrics.

I'm wondering if we need to weigh too much toward the memory overheads of histograms in making a decision here, simply because I don't think we have found them to be an extreme necessity for per-clientid/per-partition metrics, and they are more critical for aggregate (global) metrics.

So it seems the main benefits of switching to KM metrics are:
- Less exposure to YM metrics changes
- More control over the actual implementation. E.g., there is considerable research on implementing approximate-but-good-enough histograms/percentiles that we can try out
- Differences (improvements) from YM metrics such as:
  - hierarchical sensors
  - integration with quota enforcement
  - mbeans can logically group attributes computed from different sensors. So there is logical grouping (as opposed to a separate mbean per sensor as is the case in YM metrics).

The main disadvantages:
- Everyone's graphs and alerts will break and need to be updated
- Histogram support needs to be tested more/improved

The first disadvantage is a big one, but we aren't exactly immune to that if we stick with YM.

BTW, with KM metrics we should also provide reporters (graphite, ganglia), but we probably need to do this anyway since the new clients are on KM metrics.

Thanks,

Joel
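A minimal sketch of the Clock injection those links point to, assuming the metrics 3.x API shown there (an abstract Clock whose getTick() returns nanoseconds, and a Meter(Clock) constructor); the ManualClock helper is purely illustrative:

import com.codahale.metrics.Clock;
import com.codahale.metrics.Meter;

public class MeterClockExample {

    // Illustrative test clock: getTick() is the nanosecond source the Meter uses for rates.
    static class ManualClock extends Clock {
        private long nanos = 0;
        @Override
        public long getTick() { return nanos; }
        void advanceSeconds(long seconds) { nanos += seconds * 1_000_000_000L; }
    }

    public static void main(String[] args) {
        ManualClock clock = new ManualClock();
        Meter requests = new Meter(clock);   // Meter(Clock) constructor from the link above

        requests.mark(1000);                 // record 1000 events
        clock.advanceSeconds(10);            // "wait" 10 seconds without sleeping

        // Mean rate is count divided by elapsed time from the injected clock: ~100 events/sec.
        System.out.println(requests.getMeanRate());
    }
}

With the clock injected, rate assertions in tests become deterministic instead of depending on wall-clock timing.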
On Fri, Mar 27, 2015 at 06:48:48PM +0000, Aditya Auradkar wrote:

Adding to what Jay said.

The library maintains 1k samples by default. The UniformSample has a long array, so about 8k of overhead per histogram. The ExponentiallyDecayingSample (which is what we use) has a 16 byte overhead per stored sample, so about 16k per histogram. So 10k histograms (worst case? metrics per partition and client) is about 160MB of memory in the broker.

Copying is also a problem. For percentiles on HistogramMBean, the implementation does a copy of the entire array. For example, if we called get50Percentile() and get75Percentile(), the entire array would get copied twice, which is pretty bad if we call each such metric on every MBean.

Another point Joel mentioned is that codahale metrics are harder to write tests against because we cannot pass in a Clock.

IMO, if a library is preventing us from adding all the metrics that we want to add and we have a viable alternative, we should replace it. It might be short-term pain, but in the long run we will have more useful graphs. What do people think? I can start a vote thread on this once we have a couple more opinions.

Thanks,
Aditya

________________________________________
From: Jay Kreps [jay.kr...@gmail.com]
Sent: Thursday, March 26, 2015 2:29 PM
To: dev@kafka.apache.org
Subject: Re: Metrics package discussion

Yeah that is a good summary.

The reason we don't use histograms heavily in the server is because of the memory issues. We originally did use histograms for everything, then we ran into all these issues, and ripped them out. Whether they are really useful or not, I don't know. Averages can be pretty misleading so it can be nice, but I don't know that it is critical.

-Jay

On Thu, Mar 26, 2015 at 1:58 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:

From what I can tell, Histograms don't seem to be used extensively in the Kafka server (only in RequestChannel.scala) and I'm not sure we need them for per-client metrics. Topic metrics use meters currently. Migrating graphing and alerting will be quite a significant effort for all users of Kafka. Do the potential benefits of the new metrics package outweigh this one-time migration? In the long run it seems nice to have a unified metrics package across clients and server. If we were starting out from scratch without any existing deployments, what decision would we take?

I suppose the relative effort in supporting quotas is a useful data point in this discussion. We need to throttle based on the current byte rate, which should be a "Meter" in codahale terms. The Meter implementation uses 1, 5 and 15 minute exponentially weighted moving averages. The library also does not use the most recent samples of data for Metered metrics. For calculating rates, the EWMA class has a scheduled task that runs every 5 seconds and adjusts the rate using the new data accordingly. In that particular case, I think the new library is superior since it is more responsive. If we do choose to remain with Yammer on the server, here are a few ideas on how to support quotas with relatively little effort.

- We could have a new type of Meter called "QuotaMeter" that wraps the existing meter code and follows the same pattern that the Sensor does in the new metrics library. This QuotaMeter needs to be configured with a Quota, and it can have a finer-grained rate than 1 minute (10 seconds? configurable?). Anytime we call mark(), it updates the underlying rates and throws a QuotaViolationException if required. This class can either extend Meter or be a separate implementation of the Metric superclass that every metric implements. (A rough sketch of this idea follows after this message.)

- We can also consider implementing these quotas with the new metrics package and have them co-exist with the existing metrics. This leads to 2 metric packages being used on the server, but they are both pulled in as dependencies anyway. Using the new package only for the metrics we quota on may not be a bad place to start.

Thanks,
Aditya
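A rough sketch of the QuotaMeter idea above, using a simple fixed window rather than a wrapped Yammer Meter; the class and exception names are taken from the description and are illustrative only:

// Hypothetical sketch of the "QuotaMeter" idea: record a rate over a short,
// configurable window and reject the update once the configured quota is exceeded.
public class QuotaMeter {

    public static class QuotaViolationException extends RuntimeException {
        public QuotaViolationException(String message) { super(message); }
    }

    private final double quotaPerSecond; // the configured Quota, e.g. bytes/sec
    private final long windowMs;         // finer grained than 1 minute, e.g. 10 seconds
    private long windowStartMs;
    private long observedInWindow;

    public QuotaMeter(double quotaPerSecond, long windowMs) {
        this.quotaPerSecond = quotaPerSecond;
        this.windowMs = windowMs;
        this.windowStartMs = System.currentTimeMillis();
    }

    // Like Meter.mark(): record the value, then enforce the quota for the current window.
    public synchronized void mark(long value) {
        long now = System.currentTimeMillis();
        if (now - windowStartMs >= windowMs) {
            // Roll over to a new window.
            windowStartMs = now;
            observedInWindow = 0;
        }
        observedInWindow += value;
        double allowedPerWindow = quotaPerSecond * (windowMs / 1000.0);
        if (observedInWindow > allowedPerWindow) {
            throw new QuotaViolationException("Recorded " + observedInWindow
                + " in the current window, above the allowed " + allowedPerWindow);
        }
    }
}

A per-client map of objects like this could back the byte-rate throttling discussed above; the windowing here is much cruder than either library's EWMA, which is part of why the built-in quota support in the new package is attractive.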
________________________________________
From: Jay Kreps [jay.kr...@gmail.com]
Sent: Wednesday, March 25, 2015 11:08 PM
To: dev@kafka.apache.org
Subject: Re: Metrics package discussion

Here was my understanding of the issue last time.

The yammer metrics use a random sample of requests to estimate the histogram. This allocates a fairly large array of longs (their values are longs rather than floats). A reasonable sample might be 8k entries, which would give about 64KB per histogram. There are bounds on accuracy, but they are only probabilistic. I.e., if you try to get 99% < 5 ms of inaccuracy, you will 1% of the time get more than this. This is okay until you try to alert, at which point you realize that being wrong 1% of the time is a lot if you are computing stats every second continuously on many metrics (i.e., 1 in 100 estimates will be outside your bound). This array is copied in full every time you check the metric, which is the other cause of the memory pressure.

The better approach to histograms is to calculate bucket boundaries and record arbitrarily many values in those buckets. A simple bucketing approach for latency would be 0, 5ms, 10ms, 15ms, etc., and you just count how many values fall in each bucket. Your precision is deterministically bounded by the bucket boundaries, so if you had 5ms buckets you would never have more than 5ms loss of precision. By using non-uniform bucket sizes you can make this work even better (e.g., give ~1ms precision for latencies in the 1ms range, but only 1 second precision for latencies in the 30 second range). That is what is implemented in that metrics package.

I think this bucketing approach is popular now. There is a whole "HDR histogram" library that gives lots of different bucketing methods and implements dynamic resizing so you don't have to specify an upper bound.
https://github.com/HdrHistogram/HdrHistogram
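A minimal sketch of the fixed-bucket approach described above (not the implementation in either metrics package): one counter per bucket, constant memory, and precision bounded by the bucket width:

// Illustration of a bucketed histogram: fixed bucket boundaries, one counter per
// bucket, so memory is constant and precision is bounded by the bucket width.
public class BucketedHistogram {
    private final long[] upperBounds;   // e.g. 5ms, 10ms, 15ms, ... (non-uniform is fine)
    private final long[] counts;
    private long total;

    public BucketedHistogram(long[] upperBounds) {
        this.upperBounds = upperBounds.clone();
        this.counts = new long[upperBounds.length + 1];  // last bucket catches overflow
    }

    public synchronized void record(long value) {
        int i = 0;
        while (i < upperBounds.length && value > upperBounds[i]) {
            i++;
        }
        counts[i]++;
        total++;
    }

    // Returns the upper bound of the bucket containing the given percentile,
    // i.e. the answer is accurate to within one bucket width.
    public synchronized long percentile(double p) {
        long target = (long) Math.ceil(p / 100.0 * total);
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= target) {
                return i < upperBounds.length ? upperBounds[i] : Long.MAX_VALUE;
            }
        }
        return Long.MAX_VALUE;
    }

    public static void main(String[] args) {
        // 5ms buckets up to 100ms, as in the example above.
        long[] bounds = new long[20];
        for (int i = 0; i < bounds.length; i++) bounds[i] = (i + 1) * 5;
        BucketedHistogram h = new BucketedHistogram(bounds);
        for (long v = 1; v <= 100; v++) h.record(v);
        System.out.println(h.percentile(99.0));  // 100, i.e. within 5ms of the true p99
    }
}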
Whether this matters depends entirely on whether you want histograms broken down at the client, topic, partition, or broker level or just want overall metrics. If we just want per-server aggregates for histograms then I think the memory usage is not a huge issue. If you want a histogram per topic or client or partition and have 10k of these, then that is where you start talking about something like 1GB of memory with the yammer package, which is what we hit last time. Getting percentiles at the client level is nice, and percentiles are definitely better than averages, but I'm not sure it is required.

-Jay

On Wed, Mar 25, 2015 at 9:43 PM, Neha Narkhede <n...@confluent.io> wrote:

Aditya,

If we are doing a deep dive, one of the things to investigate would be memory/GC performance. IIRC, when I was looking into codahale at LinkedIn, I remember it having quite a few memory management and GC issues while using histograms. In comparison, histograms in the new metrics package aren't very well tested.

Thanks,
Neha

On Wed, Mar 25, 2015 at 8:25 AM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:

Hey everyone,

Picking up this discussion after yesterday's KIP hangout. For anyone who did not join the meeting, we have 2 different metrics packages being used by the clients (a custom package) and the server (codahale). We are discussing whether to migrate the server to the new package.

What information do we need in order to make a decision?

Some pros of the new package:
- Using the most recent information by combining data from previous and current samples. I'm not sure how codahale does this so I'll investigate.
- We can quota on anything we measure. This is pretty cool IMO. I'll investigate the feasibility of adding this feature to codahale.
- Hierarchical metrics. For example: we can define a sensor for overall bytes-in/bytes-out and also per-client. Updating the client sensor will cause the global byte rate sensor to get modified too. (A rough sketch follows below.)

What are some of the issues with codahale? One previous discussion mentions high memory usage, but I don't have any experience with it myself.

Thanks,
Aditya

--
Thanks,
Neha
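A rough sketch of the hierarchical sensor and quota ideas mentioned above, assuming the org.apache.kafka.common.metrics API used by the new clients (Metrics, Sensor, MetricName, Quota, Rate); exact constructors and method signatures may differ between versions:

import java.util.Collections;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.metrics.MetricConfig;
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Quota;
import org.apache.kafka.common.metrics.QuotaViolationException;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Rate;

public class SensorExample {
    public static void main(String[] args) {
        Metrics metrics = new Metrics();

        // Global bytes-in sensor. Note the metric description, which shows up in jconsole.
        Sensor allBytesIn = metrics.sensor("bytes-in");
        allBytesIn.add(new MetricName("bytes-in-rate", "broker-metrics",
                "Aggregate bytes in per second", Collections.<String, String>emptyMap()),
                new Rate(TimeUnit.SECONDS));

        // Per-client sensor with the global sensor as its parent: recording here also
        // updates the aggregate, and the per-client rate is bounded by a quota.
        MetricConfig quotaConfig = new MetricConfig().quota(Quota.upperBound(1024 * 1024));
        Sensor clientBytesIn = metrics.sensor("clientA.bytes-in", quotaConfig, allBytesIn);
        clientBytesIn.add(new MetricName("bytes-in-rate", "client-metrics",
                "Bytes in per second for clientA",
                Collections.singletonMap("client-id", "clientA")),
                new Rate(TimeUnit.SECONDS));

        try {
            clientBytesIn.record(4096);   // updates both the client and global rates
        } catch (QuotaViolationException e) {
            // the client is over its byte-rate quota; throttle or reject the request
        }
    }
}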