Your feedback made me realize I need to add a "TL;DR" section, which I just
added.

I'm quoting it here. It gives a brief summary of the proposal, requiring
up to 5 minutes of reading time, and helps you get a high-level picture
before you dive into the background/motivation/solution.

----------------------
TL;DR

Working with Metrics today as a user or a developer is hard and has many
severe issues.

From the user perspective:

   - One of Pulsar's strongest features is "cheap" topics, so you can easily
   have 10k - 100k topics per broker. Once you do that, you quickly learn that
   the amount of metrics you export via "/metrics" (Prometheus-style endpoint)
   becomes really big. The cost to store them becomes too high, queries
   time out, or even the "/metrics" endpoint itself times out.
   The only options Pulsar gives you today are all-or-nothing filtering and
   very crude aggregation: you switch metrics from topic aggregation level to
   namespace aggregation level, and you can turn off producer and consumer
   level metrics. You end up doing all of it, leaving you "blind", looking at
   the metrics at the namespace level, which is too coarse. You end up
   conjuring all kinds of scripts on top of the topic stats endpoint to glue
   together some aggregated metrics view for the topics you need.
   - Summaries (a metric type giving you quantiles like p95), which are used
   in Pulsar, can't be aggregated across topics / brokers due to their
   inherent design.
   - Plugin authors spend too much time defining and exposing metrics to
   Pulsar, since the only interface Pulsar offers is writing your metrics
   yourself, as UTF-8 bytes in Prometheus text format, to a byte stream
   interface given to you (see the sketch right after this list).
   - Pulsar histograms are exported in a way that is not conformant with
   Prometheus, which means you can't compute the p95 quantile on such
   histograms, making them very hard to use in day-to-day life.
   - Too many metrics are rates, which also delta-reset every interval you
   configure in Pulsar and on restart, instead of relying on cumulative
   (ever-growing) counters and letting Prometheus use its rate() function.
   - and many more issues
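To make the plugin pain point above concrete, here is a rough sketch of what
exposing a single counter looks like today, via the PrometheusRawMetricsProvider
interface mentioned later in this thread. Method and class names are from
memory and only approximate; the point is that the plugin author hand-formats
Prometheus text themselves:

    import java.util.concurrent.atomic.AtomicLong;

    // Sketch only - interface/stream names approximate. The plugin writes
    // Prometheus text format directly into the stream the broker hands it.
    public class MyPluginMetricsProvider implements PrometheusRawMetricsProvider {
        private final AtomicLong messagesProcessed = new AtomicLong();

        @Override
        public void generate(SimpleTextOutputStream stream) {
            // You own the TYPE lines, metric naming, label escaping and any
            // aggregation - the broker just concatenates the bytes.
            stream.write("# TYPE my_plugin_messages_processed_total counter\n");
            stream.write("my_plugin_messages_processed_total{cluster=\"standalone\"} ");
            stream.write(String.valueOf(messagesProcessed.get()));
            stream.write("\n");
        }
    }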

From the developer perspective:

   - There are 4 different ways to define and record metrics in Pulsar:
   Pulsar's own metrics library, the Prometheus Java client, the BookKeeper
   metrics library, and plain native Java SDK objects (AtomicLong, ...). It's
   very confusing for developers and creates inconsistencies for the end user
   (e.g. Summary is different in each).
   - Patching your metrics into the "/metrics" Prometheus endpoint is
   confusing, cumbersome and error-prone.
   - many more

This proposal offers several key changes to solve that:

   - Cardinality (supporting 10k-100k topics per broker) is solved by
   introducing a new aggregation level for metrics called Topic Metric Group.
   Using configuration, you specify a group for each topic (using
   wildcard/regex patterns). This allows you to "zoom out" to a granularity
   level - groups - that is finer than namespaces but coarser than topics,
   and since you control how many groups you have, it solves the cardinality
   issue without sacrificing too much level of detail.
   - A fine-grained, dynamic filtering mechanism. You'll have rule-based
   dynamic configuration, allowing you to specify per namespace/topic/group
   which metrics you'd like to keep or drop. The rules allow you to set a
   default that keeps only a small amount of metrics, at group and namespace
   level only, and drops the rest. When needed, you can add an override rule
   to "open up" a certain group and get more metrics at a finer granularity
   (topic or even consumer/producer level). Since it's dynamic, you "open"
   such a group when you see it misbehaving, look at it at the topic level,
   and once everything is resolved, you "close" it again. It's a similar
   experience to logging levels in Log4j or Logback, where you set a default
   and override it per class/package. A configuration sketch for both
   mechanisms follows this list.
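To make the two mechanisms above more tangible, here is a purely illustrative
configuration sketch. None of the key names or syntax below exist today; the
concrete format will be defined in the sub-PIPs:

    # Hypothetical syntax - for illustration only.
    # 1) Assign topics to metric groups by pattern:
    topicMetricGroups:
      clients-prod:  "persistent://prod/clients/.*"
      billing-batch: "persistent://prod/billing/batch-.*"

    # 2) Dynamic filtering rules: a default plus temporary overrides.
    metricFilterRules:
      - match: "*"                     # default: group/namespace level only
        granularity: group
      - match: "group:clients-prod"    # override while investigating an issue
        granularity: topic
        include: [consumer, producer]

Once the investigation is over, you delete the override rule and the group
drops back to the default granularity, much like resetting a Log4j logger
level.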

Aggregation and filtering combined solve the cardinality issue without
sacrificing the level of detail when needed, and most importantly, you
determine for which topics/groups/namespaces that happens.

Since this change is so invasive, it requires a single metrics library to
implement all of it on top of; hence the third big change is consolidating
all four ways to define and record metrics into a single, new one:
OpenTelemetry Metrics (the Java SDK, and also Python and Go for the Pulsar
Function runners).
Introducing OpenTelemetry (OTel) also solves the biggest pain point from
the developer perspective, since it's a superb metrics library offering
everything you need, and there will be a single way - only it. It also
solves the robustness issue for plugin authors, who will use OpenTelemetry
directly. It so happens that it also solves the numerous other problems
described in the doc itself.
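To give a feel for what "a single way" means for core and plugin developers,
here is a minimal sketch using the standard OpenTelemetry Java API. The API
calls are plain OpenTelemetry; how exactly the broker hands an OpenTelemetry
instance to a plugin is still to be defined in the sub-PIPs, so treat that
wiring (and the metric names used) as an assumption:

    import io.opentelemetry.api.OpenTelemetry;
    import io.opentelemetry.api.common.AttributeKey;
    import io.opentelemetry.api.common.Attributes;
    import io.opentelemetry.api.metrics.DoubleHistogram;
    import io.opentelemetry.api.metrics.LongCounter;
    import io.opentelemetry.api.metrics.Meter;

    public class MyPluginMetrics {
        private static final AttributeKey<String> TOPIC = AttributeKey.stringKey("topic");

        private final LongCounter processed;
        private final DoubleHistogram processLatency;

        // Assumption: the broker passes the plugin an OpenTelemetry instance;
        // the exact hand-off will be defined in a sub-PIP.
        public MyPluginMetrics(OpenTelemetry otel) {
            Meter meter = otel.getMeter("pulsar.my-plugin");
            processed = meter.counterBuilder("my_plugin.messages.processed")
                    .setDescription("Messages processed by the plugin")
                    .setUnit("{message}")
                    .build();
            processLatency = meter.histogramBuilder("my_plugin.process.duration")
                    .setDescription("Time spent processing a message")
                    .setUnit("ms")
                    .build();
        }

        public void recordProcessed(String topic, double latencyMs) {
            Attributes attrs = Attributes.of(TOPIC, topic);
            processed.add(1, attrs);
            processLatency.record(latencyMs, attrs);
        }
    }

Because the instruments are cumulative and the Prometheus exporter emits
standard counter and histogram bucket series, the usual rate() and
histogram_quantile() queries should work out of the box, which also addresses
the delta-reset and non-conformant-histogram issues listed above.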

The solution will be introduced as another layer, behind feature toggles, so
you can work with the existing system and/or OTel, until the existing system
is gradually deprecated.

It's a big breaking change for Pulsar users on many fronts: names,
semantics, configuration. Read the end of this doc to learn exactly what
will change for the user (at a high level).

In my opinion, it will make the Pulsar user experience so much better that
users will want to migrate to it, despite the breaking change.

This was a very short summary. You are most welcome to read the full
design document below and share feedback, so we can make it better.

On Sun, May 7, 2023 at 7:52 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:

>
>
> On Sun, May 7, 2023 at 4:23 PM Yunze Xu <y...@streamnative.io.invalid>
> wrote:
>
>> I was excited to learn much more about metrics when I started reading
>> this proposal. But I became more and more frustrated when I found
>> there was still too much content left even after I had already spent much
>> time reading it. I'm wondering how much time you expected
>> reviewers to spend reading this proposal. I just recalled the
>> discussion you started before [1]. Did you expect each PMC member that
>> gives his/her +1 to read only parts of this proposal?
>>
>
> I estimated around 2 hours needed for a reviewer.
> I hate it being so long, but I simply couldn't find a way to downsize it
> more. Furthermore, I consulted with my colleagues including Matteo, but we
> couldn't see a way to scope it down.
> Why? Because once you begin this journey, you need to know how it's going
> to end.
> What I ended up doing is writing all the crucial details for review in
> the High Level Design section.
> It's still a big, hefty section, but I don't think I can step out or let
> anyone else change Pulsar so invasively without the full extent of the
> change.
>
> I don't think it's wise to read only parts.
> I did my best to minimize it, but the scope is simply big.
> I'm open to suggestions, but it requires reading the whole PIP :)
>
> Thanks a lot Yunze for dedicating any time to it.
>
>
>
>
>>
>> Let's get back to the proposal. For now, what I mainly learned and
>> am mostly concerned about is:
>> 1. Pulsar has many ways to expose metrics. They're not unified, and it's confusing.
>> 2. The current metrics system cannot support a large number of topics.
>> 3. It's hard for plugin authors to integrate metrics. (For example,
>> KoP [2] integrates metrics by implementing the
>> PrometheusRawMetricsProvider interface, and it indeed needs much work.)
>>
>> Regarding the 1st issue, this proposal chooses OpenTelemetry (OTel).
>>
>> Regarding the 2nd issue, I scrolled to the "Why OpenTelemetry?"
>> section. It's still frustrating to see no answer. Eventually, I found
>>
>
> OpenTelemetry isn't the solution for a large number of topics.
> The solution is described in the
> "Aggregate and Filtering to solve cardinality issues" section.
>
>
>
>> the explanation in the "What we need to fix in OpenTelemetry -
>> Performance" section. It seems that we still need some enhancements in
>> OTel. In other words, currently OTel is not ready to resolve all
>> the issues listed in the proposal, but we believe it will be.
>>
>
> Let me rephrase "believe" --> we are working together with the maintainers
> to make it happen, yes.
> I am open to any other suggestions.
>
>
>
>>
>> As for the 3rd issue, from the "Integrating with Pulsar Plugins"
>> section, the plugin authors still need to implement the new OTel
>> interfaces. Is it much easier than using the existing ways to expose
>> metrics? Could metrics still be easily integrated with Grafana?
>>
>
> Yes, it's way easier.
> Basically you have full-fledged metrics library objects: Meter, Gauge,
> Histogram, Counter.
> No more Raw Metrics Provider writing UTF-8 bytes in Prometheus format.
> You get namespacing for free with the Meter name and version.
> It's way better than the current solution and any other library.
>
>
>>
>> That's all I am concerned about at the moment. I understand and
>> appreciate that you've spent much time studying and explaining all
>> these things. But this proposal is still too huge.
>>
>
> I appreciate your effort a lot!
>
>
>
>>
>> [1] https://lists.apache.org/thread/04jxqskcwwzdyfghkv4zstxxmzn154kf
>> [2]
>> https://github.com/streamnative/kop/blob/master/kafka-impl/src/main/java/io/streamnative/pulsar/handlers/kop/stats/PrometheusMetricsProvider.java
>>
>> Thanks,
>> Yunze
>>
>> On Sun, May 7, 2023 at 5:53 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:
>> >
>> > I would very much appreciate feedback from multiple Pulsar users and
>> > devs on this PIP, since it suggests dramatic changes and quite an
>> > extensive positive change for the users.
>> >
>> >
>> > On Thu, Apr 27, 2023 at 7:32 PM Asaf Mesika <asaf.mes...@gmail.com>
>> > wrote:
>> >
>> > > Hi all,
>> > >
>> > > I'm very excited to release a PIP I've been working on in the past 11
>> > > months, which I think will be immensely valuable to Pulsar, which I
>> > > like so much.
>> > >
>> > > PIP: https://github.com/apache/pulsar/issues/20197
>> > >
>> > > I'm quoting here the preface:
>> > >
>> > > === QUOTE START ===
>> > >
>> > > Roughly 11 months ago, I started working on solving the biggest issue
>> > > with Pulsar metrics: the lack of ability to monitor a Pulsar broker
>> > > with a large topic count: 10k, 100k, and future support of 1M. This
>> > > started by mapping the existing functionality and then enumerating
>> > > all the problems I saw (all documented in this doc
>> > > <https://docs.google.com/document/d/1vke4w1nt7EEgOvEerPEUS-Al3aqLTm9cl2wTBkKNXUA/edit?usp=sharing>
>> > > ).
>> > >
>> > > This PIP is a parent PIP. It aims to gradually solve (using sub-PIPs)
>> > > all the current metric system's problems and provide the ability to
>> > > monitor a broker with a large topic count, which is currently
>> > > lacking. As a parent PIP, it will describe each problem and its
>> > > solution at a high level, leaving fine-grained details to the
>> > > sub-PIPs. The parent PIP ensures all solutions align and do not
>> > > contradict each other.
>> > >
>> > > The basic building block to solve the monitoring ability of large
>> > > topic count is aggregating internally (to topic groups) and adding
>> > > fine-grained filtering. We could have shoe-horned it into the
>> > > existing metric system, but we thought adding that to a system
>> > > already ingrained with many problems would be wrong and hard to do
>> > > gradually, as so many things will break. This is why the
>> > > second-biggest design decision presented here is consolidating all
>> > > existing metric libraries into a single one - OpenTelemetry
>> > > <https://opentelemetry.io/>. The parent PIP will explain why
>> > > OpenTelemetry was chosen out of existing solutions and why it far
>> > > exceeds all other options. I’ve been working closely with the
>> > > OpenTelemetry community in the past eight months: brainstorming this
>> > > integration and raising issues, in an effort to remove serious
>> > > blockers to make this migration successful.
>> > >
>> > > I made every effort to summarize this document so that it can be
>> > > concise yet clear. I understand it is an effort to read it and, more
>> > > so, provide meaningful feedback on such a large document; hence I’m
>> > > very grateful for each individual who does so.
>> > >
>> > > I think this design will help improve the user experience immensely,
>> > > so it is worth the time spent reading it.
>> > >
>> > >
>> > > === QUOTE END ===
>> > >
>> > >
>> > > Thanks!
>> > >
>> > > Asaf Mesika
>> > >
>>
>
