Github user jtuple commented on the issue:
https://github.com/apache/zookeeper/pull/580
Sorry for the delay, August was a super busy month.
I've rebased onto latest master + added basic unit tests for
`AvgMinMaxCounter` and `SimpleCounter`. I also resolved the findbugs issue and
the `testMonitor` unit tests. Jenkin still seems unhappy, but I'm not sure
what's up -- the linked test results show no failures.
I've also addressed the majority of inline comments. Please let me know if
there's anything else I overlooked.
---
As mentioned in a prior comment, this pull request is just the first of
many. In this PR we add the `AvgMinMaxCounter` and `SimpleCounter` metrics.
We have additional metrics internally that we'll upstream in the future,
including `AvgMinMaxPercentileCounter` as well as set variants:
`AvgMinMaxCounterSet` and `AvgMinMaxPercentileCounterSet`. I talked about both
of these in [a previous
comment](https://github.com/apache/zookeeper/pull/580#issuecomment-410340039)
Those are all the types we currently have internally. Other types like a
Gauge could easily be done in a future pull-request, although our experience
has been that gauge's are less useful in a polling based approach since you
only see the most recent value. Something like an `AvgMinMaxCounter` metric
gives you more information when there are multiple events, and trivially
reduces to a gauge-equivalent when there's only a single value during the
sampling interval.
For things like commit processor queue sizes and the like, we simply query
the queue size every time we enqueue an item and report that instantaneous size
as a value in an `AvgMinMaxCounter`. When we periodically query metrics, we
then know the min/max/avg queue size during the interval, and as well as the
population count which is equivalent to "number of enqueues" during the
interval.
---
@hanm: For your example, you have a few options:
1. Do as you mentioned and add a new metric to the `ServerMetric` enum for
each request type.
2. Once we land the set-variants, you could add a single metric to the
`ServerMetric` enum and rely upon the set logic. Eg. add a `REQUEST_LATENCY(new
AvgMinMaxCounterSet("request_latency"))` metric and then do
`ServerMetrics.REQUEST_LATENCY.add("create_session", latency)`,
`ServerMetrics.REQUEST_LATENCY.add("set_data", latency)`, etc. For your
example, you wanted to track counts so you'd probably want a `SimpleCounterSet`
which we don't have, but would be easy to create.
3. Use something custom. For example, the majority of our internal metrics
build on `ServerMetrics`, but we do have a handful that don't. One example that
we'll be upstreaming soon is our so-called "request state" metrics that track
the count for requests at different stages in the request pipeline. We maintain
a count of requests queued, issued, and completed for different category of
requests (all, read, write, create_session, close_session, sync, auth,
set_watches, other). This matrix is also added to the output of `/metrics` but
doesn't use any of the `ServerMetrics` logic.
---