(1) It will be interesting to see what others use for monitoring
integration, so we can tell what is already covered by existing JMX
integrations and what needs special support.

(2) I think the migration story is more important - this is an
incompatible change, right? So we can't do it in the 0.8.3 timeframe;
it has to be in 0.9. And we need to figure out how users will migrate -
do we just tell everyone "please reconfigure all your monitors from
scratch - don't worry, it is worth it"?
I know you keep saying we did this before and our users are used to it,
but I think there are a lot more users now, and some of them have
different compatibility expectations. We probably need to find:
* The least painful way to migrate - can we keep the names of at least
most of the metrics intact?
* A good explanation of what users gain from this painful migration
(e.g. more accurate statistics thanks to the gazillion histograms)






On Mon, Mar 30, 2015 at 6:29 PM, Jun Rao <j...@confluent.io> wrote:
> If we are committed to migrating the broker side metrics to KM for the next
> release, we will need to (1) have a story on supporting common reporters
> (as listed in KAFKA-1930), and (2) see if the current histogram support is
> good enough for measuring things like request time.
>
> Thanks,
>
> Jun
>
> On Mon, Mar 30, 2015 at 3:03 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
>> If we do plan to use the network code in client, I think that is a good
>> reason in favor of migration. It will be unnecessary to have metrics from
>> multiple libraries coexist since our users will have to start monitoring
>> these new metrics anyway.
>>
>> I also agree with Jay that in multi-tenant clusters people care about
>> detailed statistics for their own application over global numbers.
>>
>> Based on the arguments so far, I'm +1 for migrating to KM.
>>
>> Thanks,
>> Aditya
>>
>> ________________________________________
>> From: Jun Rao [j...@confluent.io]
>> Sent: Sunday, March 29, 2015 9:44 AM
>> To: dev@kafka.apache.org
>> Subject: Re: Metrics package discussion
>>
>> There is another thing to consider. We plan to reuse the client components
>> on the server side over time. For example, as part of the security work, we
>> are looking into replacing the server side network code with the client
>> network code (KAFKA-1928). However, the client network already has metrics
>> based on KM.
>>
>> Thanks,
>>
>> Jun
>>
>> On Sat, Mar 28, 2015 at 1:34 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>
>> > I think Joel's summary is good.
>> >
>> > I'll add a few more points:
>> >
>> > As discussed, memory matters a lot if we want to be able to give
>> > percentiles
>> > at the client or topic level, in which case we will have thousands of
>> them.
>> > If we just do histograms at the global level then it is not a concern.
>> The
>> > argument for doing histograms at the client and topic level is that
>> > averages are often very misleading, especially for latency information or
>> > other asymmetric distributions. Most people who care about this kind of
>> > thing would say the same. If you are a user of a multi-tenant cluster
>> then
>> > you probably care a lot more about stats for your application or your
>> topic
>> > rather than the global, so it could be nice to have histograms for
>> these. I
>> > don't feel super strongly about this.
>> >
>> > The ExponentiallyDecayingSample is internally
>> > a ConcurrentSkipListMap<Double, Long>. This seems to have an overhead of
>> > about 64 bytes per entry. So a 1000 element sample is 64KB. For global
>> > metrics this is fine, but for granular metrics not workable.
>> >
>> > Two other issues I'm not sure about:
>> >
>> > 1. Is there a way to get metric descriptions into the coda hale JMX
>> output?
>> > One of the really nicest practical things about the new client metrics is
>> > that if you look at them in jconsole each metric has an associated
>> > description that explains what it means. I think this is a nice usability
>> > thing--it is really hard to know what to make of the current metrics
>> > without this kind of documentation and keeping separate docs up-to-date
>> is
>> > really hard and even if you do it most people won't find it. (A rough
>> > sketch of how the client metrics attach a description follows point 2
>> > below.)
>> >
>> > 2. I'm not clear if the sample decay in the histogram is actually the
>> same
>> > as for the other stats. It seems like it isn't but this would make
>> > interpretation quite difficult. In other words if I have N metrics
>> > including some Histograms some Meters, etc are all these measurements all
>> > taken over the same time window? I actually think they are not; it looks
>> > like there are different sampling methodologies across metric types. So
>> > this means if
>> > you have a dashboard that plots these things side by side the measurement
>> > at a given point in time is not actually comparable across multiple
>> stats.
>> > Am I confused about this?
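>> >
>> > Coming back to point 1, here is roughly how the new client metrics attach
>> > a description (signatures from memory against org.apache.kafka.common.metrics,
>> > so treat this as a sketch rather than the exact API; it also assumes a
>> > JmxReporter is registered the way the clients do it):
>> >
>> >   import java.util.Collections;
>> >   import org.apache.kafka.common.MetricName;
>> >   import org.apache.kafka.common.metrics.Metrics;
>> >   import org.apache.kafka.common.metrics.Sensor;
>> >   import org.apache.kafka.common.metrics.stats.Avg;
>> >
>> >   public class MetricDescriptionSketch {
>> >       public static void main(String[] args) {
>> >           Metrics metrics = new Metrics();
>> >           Sensor requestTime = metrics.sensor("request-time");
>> >           // The description string is what shows up next to the attribute
>> >           // in jconsole, which is the usability win described above.
>> >           requestTime.add(new MetricName("request-time-avg", "request-metrics",
>> >               "The average time taken to service a request, in ms.",
>> >               Collections.<String, String>emptyMap()), new Avg());
>> >           requestTime.record(42.0);
>> >       }
>> >   }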
>> >
>> > -Jay
>> >
>> >
>> > On Fri, Mar 27, 2015 at 6:27 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
>> >
>> > > For the samples: it will be at least double that estimate I think
>> > > since the long array contains (eight byte) references to the actual
>> > > longs, each of which also have some object overhead.
>> > >
>> > > Re: testing: actually, it looks like YM metrics does allow you to
>> > > drop in your own clock:
>> > >
>> > >
>> >
>> https://github.com/dropwizard/metrics/blob/master/metrics-core/src/main/java/com/codahale/metrics/Clock.java
>> > >
>> > >
>> >
>> https://github.com/dropwizard/metrics/blob/master/metrics-core/src/main/java/com/codahale/metrics/Meter.java#L36
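>> > >
>> > > So a deterministic test could drop in a manual clock, something like
>> > > this rough sketch (written against the metrics-core 3.x API and the
>> > > Meter(Clock) constructor linked above, so take the details with a grain
>> > > of salt):
>> > >
>> > >   import java.util.concurrent.TimeUnit;
>> > >   import com.codahale.metrics.Clock;
>> > >   import com.codahale.metrics.Meter;
>> > >
>> > >   public class MeterClockSketch {
>> > >       // A clock we can advance by hand instead of sleeping in tests.
>> > >       static class ManualClock extends Clock {
>> > >           private long nanos = 0;
>> > >           void advanceSeconds(long s) { nanos += TimeUnit.SECONDS.toNanos(s); }
>> > >           @Override public long getTick() { return nanos; }
>> > >       }
>> > >
>> > >       public static void main(String[] args) {
>> > >           ManualClock clock = new ManualClock();
>> > >           Meter meter = new Meter(clock);
>> > >           meter.mark(1000);
>> > >           clock.advanceSeconds(10);
>> > >           // No Thread.sleep(): the rate moves only when we advance the clock.
>> > >           System.out.println(meter.getMeanRate()); // ~100 events/sec
>> > >       }
>> > >   }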
>> > >
>> > > Not sure if it was mentioned in this (or some recent) thread but a
>> > > major motivation in the kafka-common metrics (KM) was absorbing API
>> > > changes and even mbean naming conventions. For e.g., in the early
>> > > stages of 0.8 we picked up YM metrics 3.x but collided with client
>> > > apps at LinkedIn which were still on 2.x. We ended up changing our
>> > > code to use 2.x in the end. Having our own metrics package makes us
>> > > less vulnerable to these kinds of changes. The multiple version
>> > > collision problem is obviously less of an issue with the broker but we
>> > > are still exposed to possible metric changes in YM metrics.
>> > >
>> > > I'm wondering if we need to weight the memory overhead of histograms
>> > > too heavily in making a decision here, simply because I don't think
>> > > we have found them to be an absolute necessity for
>> > > per-clientid/per-partition metrics; they are more critical for
>> > > aggregate (global) metrics.
>> > >
>> > > So it seems the main benefits of switching to KM metrics are:
>> > > - Less exposure to YM metrics changes
>> > > - More control over the actual implementation. E.g., there is
>> > >   considerable research on implementing approximate-but-good-enough
>> > >   histograms/percentiles that we can try out
>> > > - Differences (improvements) from YM metrics such as:
>> > >   - hierarchical sensors (rough sketch after this list)
>> > >   - integrated with quota enforcement
>> > >   - mbeans can logically group attributes computed from different
>> > >     sensors. So there is logical grouping (as opposed to a separate
>> > >     mbean per sensor as is the case in YM metrics).
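>> > >
>> > > To make the hierarchical sensor point concrete, roughly (class and
>> > > method names from the client metrics package as I remember them, so
>> > > treat the signatures as approximate):
>> > >
>> > >   import java.util.Collections;
>> > >   import org.apache.kafka.common.MetricName;
>> > >   import org.apache.kafka.common.metrics.Metrics;
>> > >   import org.apache.kafka.common.metrics.Sensor;
>> > >   import org.apache.kafka.common.metrics.stats.Rate;
>> > >
>> > >   public class SensorHierarchySketch {
>> > >       public static void main(String[] args) {
>> > >           Metrics metrics = new Metrics();
>> > >
>> > >           // Aggregate sensor for all incoming bytes.
>> > >           Sensor allBytesIn = metrics.sensor("bytes-in");
>> > >           allBytesIn.add(new MetricName("byte-rate", "broker-metrics",
>> > >               "Aggregate incoming byte rate",
>> > >               Collections.<String, String>emptyMap()), new Rate());
>> > >
>> > >           // Per-client sensor registered with the aggregate as its parent.
>> > >           Sensor clientBytesIn = metrics.sensor("bytes-in.client-A", allBytesIn);
>> > >           clientBytesIn.add(new MetricName("byte-rate", "client-metrics",
>> > >               "Incoming byte rate for client A",
>> > >               Collections.singletonMap("client-id", "A")), new Rate());
>> > >
>> > >           // Recording on the child updates the parent as well.
>> > >           clientBytesIn.record(4096);
>> > >       }
>> > >   }
>> > >
>> > > And since the reporter groups attributes by metric group and tags, the
>> > > per-client and aggregate rates end up as attributes on logically grouped
>> > > mbeans rather than one mbean per sensor.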
>> > >
>> > > The main disadvantages:
>> > > - Everyone's graphs and alerts will break and need to be updated
>> > > - Histogram support needs to be tested more/improved
>> > >
>> > > The first disadvantage is a big one but we aren't exactly immune to
>> > > that if we stick with YM.
>> > >
>> > > BTW with KM metrics we should also provide reporters (graphite,
>> > > ganglia) but we probably need to do this anyway since the new clients
>> > > are on KM metrics.
>> > >
>> > > Thanks,
>> > >
>> > > Joel
>> > >
>> > > On Fri, Mar 27, 2015 at 06:48:48PM +0000, Aditya Auradkar wrote:
>> > > > Adding to what Jay said.
>> > > >
>> > > > The library maintains 1k samples by default. The UniformSample has a
>> > > long array so about 8k overhead per histogram. The
>> > > ExponentiallyDecayingSample (which is what we use) has a 16 byte
>> overhead
>> > > per stored sample, so about 16k per histogram. So 10k histograms (worst
>> > > case? metrics per partition and client) is about 160MB of memory in the
>> > > broker.
>> > > >
>> > > > Copying is also a problem. For percentiles on HistogramMBean, the
>> > > > implementation does a copy of the entire array. E.g., if we called
>> > > > get50Percentile() and get75Percentile(), the entire array would get
>> > > > copied twice, which is pretty bad if we call each metric on every MBean.
>> > > >
>> > > > Another point Joel mentioned is that codahale metrics are harder to
>> > > write tests against because we cannot pass in a Clock.
>> > > >
>> > > > IMO, if a library is preventing us from adding all the metrics that
>> we
>> > > want to add and we have a viable alternative, we should replace it. It
>> > > might be short term pain but in the long run we will have more useful
>> > > graphs.
>> > > > What do people think? I can start a vote thread on this once we have
>> a
>> > > couple more opinions.
>> > > >
>> > > > Thanks,
>> > > > Aditya
>> > > > ________________________________________
>> > > > From: Jay Kreps [jay.kr...@gmail.com]
>> > > > Sent: Thursday, March 26, 2015 2:29 PM
>> > > > To: dev@kafka.apache.org
>> > > > Subject: Re: Metrics package discussion
>> > > >
>> > > > Yeah that is a good summary.
>> > > >
>> > > > The reason we don't use histograms heavily in the server is because
>> of
>> > > the
>> > > > memory issues. We originally did use histograms for everything, then
>> we
>> > > ran
>> > > > into all these issues, and ripped them out. Whether they are really
>> > > useful
>> > > > or not, I don't know. Averages can be pretty misleading so it can be
>> > nice
>> > > > but I don't know that it is critical.
>> > > >
>> > > > -Jay
>> > > >
>> > > > On Thu, Mar 26, 2015 at 1:58 PM, Aditya Auradkar <
>> > > > aaurad...@linkedin.com.invalid> wrote:
>> > > >
>> > > > > From what I can tell, Histograms don't seem to be used extensively
>> in
>> > > the
>> > > > > Kafka server (only in RequestChannel.scala) and I'm not sure we
>> need
>> > > them
>> > > > > for per-client metrics. Topic metrics use meters currently.
>> > Migrating
>> > > > > graphing, alerting will be quite a significant effort for all users
>> > of
>> > > > > Kafka. Do the potential benefits of the new metrics package
>> outweigh
>> > > this
>> > > > > one time migration? In the long run it seems nice to have a unified
>> > > metrics
>> > > > > package across clients and server. If we were starting out from
>> > scratch
>> > > > > without any existing deployments, what decision would we take?
>> > > > >
>> > > > > I suppose the relative effort in supporting quotas is a useful data
>> > > > > point in this
>> > > > > discussion. We need to throttle based on the current byte rate
>> which
>> > > should
>> > > > > be a "Meter" in codahale terms. The Meter implementation uses a 1,
>> 5
>> > > and 15
>> > > > > minute exponential window moving average. The library also does not
>> > > use the
>> > > > > most recent samples of data for Metered metrics. For calculating
>> > > rates, the
>> > > > > EWMA class has a scheduled task that runs every 5 seconds and
>> adjusts
>> > > the
>> > > > > rate using the new data accordingly. In that particular case, I
>> think
>> > > the
>> > > > > new library is superior since it is more responsive.  If we do
>> choose
>> > > to
>> > > > > remain with Yammer on the server, here are a few ideas on how to
>> > > support
>> > > > > quotas with relatively less effort.
>> > > > >
>> > > > > - We could have a new type of Meter called "QuotaMeter" that can
>> wrap
>> > > the
>> > > > > existing meter code that follows the same pattern that the Sensor
>> > does
>> > > in
>> > > > > the new metrics library. This QuotaMeter needs to be configured
>> with
>> > a
>> > > > > Quota and it can have a finer grained rate than 1 minute (10
>> seconds?
>> > > > > configurable?). Anytime we call "mark()", it updates the underlying
>> > > > > rates and throws a QuotaViolationException if required. This class can
>> > > > > either
>> > > > > extend Meter or be a separate implementation of the Metric
>> superclass
>> > > that
>> > > > > every metric implements (rough sketch after the second idea below).
>> > > > >
>> > > > > - We can also consider implementing these quotas with the new
>> metrics
>> > > > > package and have these co-exist with the existing metrics. This
>> leads
>> > > to 2
>> > > > > metric packages being used on the server, but they are both pulled
>> in
>> > > as
>> > > > > dependencies anyway. Using this only for the metrics we quota on may
>> > > > > not be a bad place to start.
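>> > > > >
>> > > > > For the first idea, a rough, untested sketch of what I mean
>> > > > > (QuotaMeter and QuotaViolationException are hypothetical names from
>> > > > > the proposal above, not existing classes; I'm extending the 3.x
>> > > > > Meter and using its 1-minute EWMA for brevity, even though we would
>> > > > > really want a finer-grained window):
>> > > > >
>> > > > >   import com.codahale.metrics.Meter;
>> > > > >
>> > > > >   class QuotaViolationException extends RuntimeException {
>> > > > >       QuotaViolationException(String msg) { super(msg); }
>> > > > >   }
>> > > > >
>> > > > >   class QuotaMeter extends Meter {
>> > > > >       private final double quotaPerSec;
>> > > > >
>> > > > >       QuotaMeter(double quotaPerSec) { this.quotaPerSec = quotaPerSec; }
>> > > > >
>> > > > >       @Override
>> > > > >       public void mark(long n) {
>> > > > >           super.mark(n);
>> > > > >           // A real version would track a ~10 second window instead.
>> > > > >           if (getOneMinuteRate() > quotaPerSec)
>> > > > >               throw new QuotaViolationException("Rate " + getOneMinuteRate()
>> > > > >                   + " exceeds quota " + quotaPerSec);
>> > > > >       }
>> > > > >   }
>> > > > >
>> > > > > And the second idea would lean on the quota support already in the
>> > > > > new metrics package, roughly like this (again, signatures from
>> > > > > memory, treat them as approximate):
>> > > > >
>> > > > >   import java.util.Collections;
>> > > > >   import org.apache.kafka.common.MetricName;
>> > > > >   import org.apache.kafka.common.metrics.MetricConfig;
>> > > > >   import org.apache.kafka.common.metrics.Metrics;
>> > > > >   import org.apache.kafka.common.metrics.Quota;
>> > > > >   import org.apache.kafka.common.metrics.Sensor;
>> > > > >   import org.apache.kafka.common.metrics.stats.Rate;
>> > > > >
>> > > > >   public class KmQuotaSketch {
>> > > > >       public static void main(String[] args) {
>> > > > >           Metrics metrics = new Metrics();
>> > > > >           Sensor produceBytes = metrics.sensor("produce-bytes.client-A",
>> > > > >               new MetricConfig().quota(Quota.upperBound(1024 * 1024)));
>> > > > >           produceBytes.add(new MetricName("byte-rate", "quota-metrics",
>> > > > >               "Produce byte rate for client A",
>> > > > >               Collections.singletonMap("client-id", "A")), new Rate());
>> > > > >           // record() checks the quota and throws the package's
>> > > > >           // QuotaViolationException when the rate exceeds the bound.
>> > > > >           produceBytes.record(4096);
>> > > > >       }
>> > > > >   }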
>> > > > >
>> > > > > Thanks,
>> > > > > Aditya
>> > > > > ________________________________________
>> > > > > From: Jay Kreps [jay.kr...@gmail.com]
>> > > > > Sent: Wednesday, March 25, 2015 11:08 PM
>> > > > > To: dev@kafka.apache.org
>> > > > > Subject: Re: Metrics package discussion
>> > > > >
>> > > > > Here was my understanding of the issue last time.
>> > > > >
>> > > > > The yammer metrics use a random sample of requests to estimate the
>> > > > > histogram. This allocates a fairly large array of longs (their
>> values
>> > > are
>> > > > > longs rather than floats). A reasonable sample might be 8k entries
>> > > which
>> > > > > would give about 64KB per histogram. There are bounds on accuracy,
>> > but
>> > > they
>> > > > > are only probabilistic. I.e. if you try to get 99% < 5 ms of
>> > > inaccuracy,
>> > > > > you will 1% of the time get more than this. This is okay until you
>> > > > > try to alert on it, at which point you realize that being wrong 1%
>> > > > > of the time is a lot if you are computing stats every second
>> > > > > continuously on many metrics (i.e. 1 in 100 estimates will be
>> > > > > outside your bound). This array is copied in full every time you
>> > > > > check the metric, which is the other cause of the memory pressure.
>> > > > >
>> > > > > The better approach to histograms is to calculate bucket boundaries
>> > > > > and
>> > > > > record arbitrarily many values in those buckets. A simple bucketing
>> > > > > approach for latency would be 0, 5ms, 10ms, 15ms, etc, and you just
>> > > count
>> > > > > how many fall in each bucket. Your precision is deterministically
>> > > bounded
>> > > > > by the bucket boundaries, so if you had 5ms buckets you would never
>> > > have
>> > > > > more than 5ms loss of precision. By using non-uniform bucket sizes
>> > you
>> > > can
>> > > > > make this work even better (e.g. give ~1ms precision for latencies
>> in
>> > > the
>> > > > > 1ms range, but give only 1 second precision for latencies in the 30
>> > > second
>> > > > > range). That is what is implemented in that metrics package.
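>> > > > >
>> > > > > In code, the idea is roughly this (an untested sketch of the general
>> > > > > approach, not the actual implementation in either package):
>> > > > >
>> > > > >   import java.util.concurrent.atomic.AtomicLongArray;
>> > > > >
>> > > > >   // Fixed 5 ms buckets from 0 to 1 s plus an overflow bucket. Precision
>> > > > >   // is deterministically bounded by the bucket width.
>> > > > >   public class BucketedLatencyHistogram {
>> > > > >       private static final long BUCKET_MS = 5;
>> > > > >       private final AtomicLongArray counts = new AtomicLongArray(201);
>> > > > >
>> > > > >       public void record(long latencyMs) {
>> > > > >           int bucket = (int) Math.min(latencyMs / BUCKET_MS, counts.length() - 1);
>> > > > >           counts.incrementAndGet(bucket);
>> > > > >       }
>> > > > >
>> > > > >       // Returns the upper edge of the bucket holding the requested
>> > > > >       // quantile, so the answer is never off by more than BUCKET_MS.
>> > > > >       public long quantileUpperBoundMs(double q) {
>> > > > >           long total = 0;
>> > > > >           for (int i = 0; i < counts.length(); i++) total += counts.get(i);
>> > > > >           long target = (long) Math.ceil(q * total);
>> > > > >           long seen = 0;
>> > > > >           for (int i = 0; i < counts.length(); i++) {
>> > > > >               seen += counts.get(i);
>> > > > >               if (total > 0 && seen >= target) return (i + 1) * BUCKET_MS;
>> > > > >           }
>> > > > >           return counts.length() * BUCKET_MS;
>> > > > >       }
>> > > > >   }
>> > > > >
>> > > > > Non-uniform buckets just replace the division with a lookup into a
>> > > > > boundary array.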
>> > > > >
>> > > > > I think this bucketing approach is popular now. There is a whole
>> "HDR
>> > > > > histogram" library that gives lots of different bucketing methods
>> and
>> > > > > implements dynamic resizing so you don't have to specify an upper
>> > > bound.
>> > > > >  https://github.com/HdrHistogram/HdrHistogram
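>> > > > >
>> > > > > Quick usage sketch, going by its README (exact constructor arguments
>> > > > > may be off):
>> > > > >
>> > > > >   import org.HdrHistogram.Histogram;
>> > > > >
>> > > > >   public class HdrSketch {
>> > > > >       public static void main(String[] args) {
>> > > > >           // Track values up to 30 seconds (in micros), 3 significant digits.
>> > > > >           Histogram histogram = new Histogram(30000000L, 3);
>> > > > >           histogram.recordValue(1250); // a 1.25 ms request, in micros
>> > > > >           System.out.println(histogram.getValueAtPercentile(99.0));
>> > > > >       }
>> > > > >   }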
>> > > > >
>> > > > > Whether this matters depends entirely on whether you want histograms
>> > > > > broken down at
>> > > > > the client, topic, partition, or broker level or just want overall
>> > > metrics.
>> > > > > If we just want per sever aggregates for histograms then I think
>> the
>> > > memory
>> > > > > usage is not a huge issue. If you want a histogram per topic or
>> > client
>> > > or
>> > > > > partition and have 10k of these then that is where you start talking
>> > > > > about
>> > > > > 1GB of memory with the yammer package, which is what we hit last
>> > time.
>> > > > > Getting percentiles on the client level is nice, percentiles are
>> > > definitely
>> > > > > better than averages, but I'm not sure it is required.
>> > > > >
>> > > > > -Jay
>> > > > >
>> > > > > On Wed, Mar 25, 2015 at 9:43 PM, Neha Narkhede <n...@confluent.io>
>> > > wrote:
>> > > > >
>> > > > > > Aditya,
>> > > > > >
>> > > > > > If we are doing a deep dive, one of the things to investigate
>> would
>> > > be
>> > > > > > memory/GC performance. IIRC, when I was looking into codahale at
>> > > > > LinkedIn,
>> > > > > > I remember it having quite a few memory management and GC issues
>> > > while
>> > > > > > using histograms. In comparison, histograms in the new metrics
>> > > package
>> > > > > > aren't very well tested.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Neha
>> > > > > >
>> > > > > > On Wed, Mar 25, 2015 at 8:25 AM, Aditya Auradkar <
>> > > > > > aaurad...@linkedin.com.invalid> wrote:
>> > > > > >
>> > > > > > > Hey everyone,
>> > > > > > >
>> > > > > > > Picking up this discussion after yesterdays KIP hangout. For
>> > > anyone who
>> > > > > > > did not join the meeting, we have 2 different metrics packages
>> > > being
>> > > > > used
>> > > > > > > by the clients (custom package) and the server (codahale). We
>> are
>> > > > > > > discussing whether to migrate the server to the new package.
>> > > > > > >
>> > > > > > > What information do we need in order to make a decision?
>> > > > > > >
>> > > > > > > Some pros of the new package:
>> > > > > > > - Using the most recent information by combining data from
>> > > previous and
>> > > > > > > current samples. I'm not sure how codahale does this so I'll
>> > > > > investigate.
>> > > > > > > - We can quota on anything we measure. This is pretty cool IMO.
>> > > > > > > I'll investigate the feasibility of adding this feature in codahale.
>> > > > > > > - Hierarchical metrics. For example: we can define a sensor for
>> > > overall
>> > > > > > > bytes-in/bytes-out and also per-client. Updating the client
>> > sensor
>> > > will
>> > > > > > > cause the global byte rate sensor to get modified too.
>> > > > > > >
>> > > > > > > What are some of the issues with codahale? One previous
>> > discussion
>> > > > > > > mentions high memory usage but I don't have any experience with
>> > it
>> > > > > > myself.
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > Aditya
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Thanks,
>> > > > > > Neha
>> > > > > >
>> > > > >
>> > >
>> > >
>> >
>>
