[
https://issues.apache.org/jira/browse/KAFKA-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Klukas updated KAFKA-3714:
-------------------------------
Issue Type: Improvement (was: Bug)
> Allow users greater access to register custom streams metrics
> -------------------------------------------------------------
>
> Key: KAFKA-3714
> URL: https://issues.apache.org/jira/browse/KAFKA-3714
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Jeff Klukas
> Assignee: Guozhang Wang
> Priority: Minor
> Fix For: 0.10.1.0
>
>
> Copying in some discussion that originally appeared in
> https://github.com/apache/kafka/pull/1362#issuecomment-219064302
> Kafka Streams is largely a higher-level abstraction on top of producers and
> consumers, and it seems sensible to match the KafkaStreams interface to that
> of KafkaProducer and KafkaConsumer where possible. For producers and
> consumers, the metric registry is internal and metrics are only exposed as an
> unmodifiable map. This allows users to access client metric values for use in
> application health checks, etc., but doesn't allow them to register new
> metrics.
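
To make the read-only exposure described above concrete, here is a minimal sketch. It does not use Kafka's classes: ReadOnlyMetricsClient, recordInternal, and the String-to-Double registry are hypothetical stand-ins, and the metric name is only illustrative. The real KafkaProducer.metrics() and KafkaConsumer.metrics() return a Map<MetricName, ? extends Metric>. The sketch shows only the pattern, which is an internal registry published through an unmodifiable view.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a client that keeps its metric registry internal.
final class ReadOnlyMetricsClient {
    private final Map<String, Double> registry = new ConcurrentHashMap<>();

    ReadOnlyMetricsClient() {
        // Only the client itself registers metrics (illustrative name).
        registry.put("record-send-rate", 0.0);
    }

    // Internal update path; callers of metrics() cannot reach the registry.
    void recordInternal(String name, double value) {
        registry.put(name, value);
    }

    // Live read-only view: values can be read, but put() throws
    // UnsupportedOperationException.
    Map<String, Double> metrics() {
        return Collections.unmodifiableMap(registry);
    }
}
```

A health check can read values from metrics(), but any attempt to add a new entry through the returned map fails, which matches the behaviour described for producers and consumers.
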
> That approach seems reasonable if we assume that a user interested in
> defining custom metrics is already going to be using a separate metrics
> library. In such a case, users will likely find it easier to define metrics
> using whatever library they're familiar with rather than learning the API for
> Kafka's Metrics class. Is this a reasonable assumption?
> If we want to expose the Metrics instance so that users can define arbitrary
> metrics, I'd argue that there's a need for documentation updates. In
> particular, I find the notion of metric tags confusing. Tags can be defined
> in a MetricConfig when the Metrics instance is constructed,
> StreamsMetricsImpl is maintaining its own set of tags, and users can set tag
> overrides.
> If a user were to get access to the Metrics instance, they would be missing
> the tags defined in StreamsMetricsImpl. I'm imagining that users would want
> their custom metrics to sit alongside the predefined metrics with the same
> tags, and users shouldn't be expected to manage those additional tags
> themselves.
> So, why are we allowing users to define their own metrics via the
> StreamsMetrics interface in the first place? Is it that we'd like to be able
> to provide a built-in latency metric, but the definition depends on the
> details of the use case so there's no generic solution? That would be
> sufficient motivation for this special case of addLatencySensor. If we want
> to continue down that path and give users access to define a wider range of
> custom metrics, I'd prefer to extend the StreamsMetrics interface so that
> users can call methods on that object, automatically getting the tags
> appropriate for that instance rather than interacting with the raw Metrics
> instance.
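
One way to picture the proposal above is a small wrapper that carries the instance's tags and applies them to every metric a user registers. This is only a sketch under assumptions: TagInheritingMetrics, counter, and registered are hypothetical names, not part of the StreamsMetrics interface, and a LongAdder stands in for a real sensor.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical wrapper; NOT the real StreamsMetrics interface.
final class TagInheritingMetrics {
    private final Map<String, String> instanceTags;
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    TagInheritingMetrics(Map<String, String> instanceTags) {
        // Defensive copy so later changes by the caller do not leak in.
        this.instanceTags = new LinkedHashMap<>(instanceTags);
    }

    // Returns the counter for this name and tag set, creating it on first
    // use. The instance tags are always included, so the user never has to
    // supply them.
    LongAdder counter(String name, Map<String, String> extraTags) {
        Map<String, String> tags = new LinkedHashMap<>(instanceTags);
        tags.putAll(extraTags); // a user tag overrides an instance tag with the same key
        return counters.computeIfAbsent(name + tags, key -> new LongAdder());
    }

    // Exposed only so the sketch can be inspected.
    Map<String, LongAdder> registered() {
        return counters;
    }
}
```

With this shape, a processor would call counter("emails-sent", ...) and the resulting metric would carry the same tags as the predefined ones, without the user touching the raw Metrics instance.
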
> ---
> Guozhang had the following comments:
> 1) For the producer/consumer cases, all internal metrics are provided and
> abstracted from users, who just need to read the documentation and poll
> whichever provided metrics they are interested in. If they want to define
> more metrics, those are likely to live outside the clients themselves, and
> users can use whatever methods they like, so Metrics does not need to be
> exposed to users.
> 2) For streams, things are a bit different: users define the computational
> logic, which becomes part of the "Streams Client" processing and which
> users may want to monitor themselves. Think of a customized processor that
> sends an email to some address based on a condition, where users want to
> monitor the average rate of emails sent. Hence it is worth considering
> whether or not they should be able to access the Metrics instance to define
> their own metrics alongside the predefined metrics provided by the library.
> 3) Now, since the Metrics class was not previously designed for public usage,
> it is not very user-friendly for defining sensors, especially given the
> semantic differences between name / scope / tags. StreamsMetrics tries to
> hide some of this semantic confusion from users, but it still exposes tags
> and hence does not fully succeed. We need to think of a better approach so
> that: a) user-defined metrics will be "aligned" (i.e. with the same name
> prefix within a single application, with a similar scope hierarchy
> definition, etc.) with library-provided metrics, and b) the APIs for doing
> so are natural.
> I do not have concrete ideas about 3) above off the top of my head; comments
> are more than welcome.
> ---
> I'm not sure that I agree that 1) and 2) are truly different situations. A
> user might choose to send email messages within a bare consumer rather than a
> streams application, and still want to maintain a metric of sent emails. In
> this bare consumer case, we'd expect the user to define that email-sent
> metric outside of Kafka's metrics machinery.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)