[ https://issues.apache.org/jira/browse/KAFKA-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146681#comment-17146681 ]
Sophie Blee-Goldman commented on KAFKA-10177:
-
I haven't personally looked into HdrHistogram specifically, but I think the
approach of porting over an existing and well-tested implementation is the
right way to go.
An extra dependency is probably not worth it, and it shouldn't be too
complicated to re-implement a reasonable percentiles algorithm (famous last
words, I know...).
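To give a sense of what "a reasonable percentiles algorithm" could look like, here is a minimal sketch that uses geometrically sized buckets, so the relative error stays bounded without a static upper limit. This is not HdrHistogram's algorithm and not anything that exists in Kafka; the class name LogBucketHistogram, its methods, and the relativeError parameter are all hypothetical, and it only handles positive values.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: histogram with geometrically growing buckets (positive values only). */
public class LogBucketHistogram {
    private final double gamma;     // ratio between consecutive bucket edges
    private final double logGamma;
    // Sparse storage: only buckets that have seen a value take up space.
    private final TreeMap<Integer, Long> counts = new TreeMap<>();
    private long total;

    /** @param relativeError target relative accuracy, e.g. 0.01 for 1% */
    public LogBucketHistogram(double relativeError) {
        this.gamma = (1 + relativeError) / (1 - relativeError);
        this.logGamma = Math.log(gamma);
    }

    public void record(double value) {
        if (value <= 0) {
            throw new IllegalArgumentException("value must be positive");
        }
        // Bucket i covers the range (gamma^(i-1), gamma^i].
        int idx = (int) Math.ceil(Math.log(value) / logGamma);
        counts.merge(idx, 1L, Long::sum);
        total++;
    }

    /** Estimates the q-quantile (0 < q <= 1); returns NaN if nothing was recorded. */
    public double quantile(double q) {
        if (total == 0) {
            return Double.NaN;
        }
        long target = Math.max(1, (long) Math.ceil(q * total));
        long seen = 0;
        for (Map.Entry<Integer, Long> e : counts.entrySet()) {
            seen += e.getValue();
            if (seen >= target) {
                // Representative value chosen so the relative error is at most
                // (gamma - 1) / (gamma + 1), i.e. the configured relativeError.
                return 2 * Math.pow(gamma, e.getKey()) / (gamma + 1);
            }
        }
        throw new IllegalStateException("counts and total are out of sync");
    }

    public int bucketCount() {
        return counts.size();
    }
}
```

The trade-off in this sketch is that the number of buckets grows with the logarithm of the value range rather than being fixed up front, so a real implementation would probably still want some cap on bucket count.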
> Replace/improve Percentiles metrics
> ---
>
> Key: KAFKA-10177
> URL: https://issues.apache.org/jira/browse/KAFKA-10177
> Project: Kafka
> Issue Type: Improvement
> Components: metrics
> Reporter: Sophie Blee-Goldman
> Priority: Major
>
> There's an existing – but seemingly unused – implementation of percentile
> metrics that we attempted to use for end-to-end latency metrics in Streams.
> Unfortunately a number of limitations became apparent, and we ultimately
> pulled the metrics from the 2.6 release pending further
> investigation/improvement.
> The problems we encountered were:
> # Need to set a static upper/lower limit for the values
> # Not well suited to a distribution with a long tail, i.e. setting the max
> value too high caused the accuracy to plummet
> # Required a lot of memory per metric for reasonable accuracy and caused us
> to hit OOM (unclear if there was actually a memory leak, or it was just
> gobbling up unnecessarily large amounts in general)
> Since the Percentiles class is part of the public API, we may need to create
> a new class altogether and possibly deprecate/remove the old one.
> Alternatively, we can consider re-implementing the existing class from
> scratch and deprecating only the current constructors and associated
> implementation (e.g. the constructor that accepts a max).
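The interaction between the static limit, long tails, and memory described in the issue can be illustrated with a small sketch. This is not Kafka's Percentiles class; FixedRangeHistogram and its methods are hypothetical, and it assumes the simplest possible scheme of equal-width buckets over a fixed [min, max] range.

```java
/** Hypothetical fixed-range histogram with equal-width buckets; not Kafka's Percentiles class. */
public class FixedRangeHistogram {
    private final double min;
    private final double bucketWidth;
    private final long[] counts;
    private long total;

    public FixedRangeHistogram(double min, double max, int buckets) {
        this.min = min;
        this.bucketWidth = (max - min) / buckets;
        this.counts = new long[buckets];
    }

    public void record(double value) {
        int idx = (int) ((value - min) / bucketWidth);
        // Values outside [min, max) are clamped into the first or last bucket.
        idx = Math.max(0, Math.min(counts.length - 1, idx));
        counts[idx]++;
        total++;
    }

    /** Returns the upper edge of the bucket that contains the q-quantile (0 < q <= 1). */
    public double quantile(double q) {
        long target = (long) Math.ceil(q * total);
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= target) {
                return min + (i + 1) * bucketWidth;
            }
        }
        return min + counts.length * bucketWidth;
    }

    public double bucketWidth() {
        return bucketWidth;
    }
}
```

With 100 buckets over [0, 100], values 1 through 50 give a median estimate within one unit of the true value. With the same 100 buckets over [0, 100000], to leave room for a long tail, every one of those values lands in the first bucket and the estimate becomes 1000. In this sketch, keeping one-unit resolution over the wider range would take 100,000 buckets, which is roughly 800 KB of long counters per metric.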