Re: Percentile metrics in Beam

Kenneth Knowles Fri, 17 Sep 2021 08:39:47 -0700

If I recall correctly from when the metrics were introduced (
http://s.apache.org/beam-metrics-api), the intention of the Distribution
metric was to allow the representation to be more flexible. The name was
chosen to be more abstract, so that a runner could track the data in its own
way. Specifically, Distribution was supposed to serve the same purpose as
the Dropwizard Histogram. Is it possible to extend it in a
backwards-compatible way, or not?
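
One way to picture a backwards-compatible extension, as a minimal sketch
only: the new capability is exposed as an optional query whose default is
"not supported", so existing implementations keep compiling and behaving as
before. Every name here (DistributionSketch, Result, BasicResult,
PercentileResult) is hypothetical and is not Beam's or Dropwizard's API. The
percentile uses the simple nearest-rank method over stored samples, which a
real runner would likely replace with a bounded-memory structure.

```java
import java.util.Arrays;
import java.util.OptionalDouble;

public class DistributionSketch {

  /** Hypothetical stand-in for a distribution result; NOT Beam's DistributionResult. */
  public interface Result {
    long sum();
    long count();
    long min();
    long max();

    /** New, optional capability. Existing implementations inherit "not supported". */
    default OptionalDouble percentile(double p) {
      return OptionalDouble.empty();
    }
  }

  /** An "old" result that only knows sum, count, min and max. */
  public static class BasicResult implements Result {
    protected final long[] values;

    public BasicResult(long... values) {
      this.values = values.clone();
    }

    @Override public long sum() { return Arrays.stream(values).sum(); }
    @Override public long count() { return values.length; }
    @Override public long min() { return Arrays.stream(values).min().orElse(Long.MAX_VALUE); }
    @Override public long max() { return Arrays.stream(values).max().orElse(Long.MIN_VALUE); }
  }

  /** A "new" result that keeps sorted samples and answers percentile queries. */
  public static final class PercentileResult extends BasicResult {
    public PercentileResult(long... values) {
      super(values);
      Arrays.sort(this.values); // sorts our private copy, not the caller's array
    }

    /** Nearest-rank percentile for p in (0, 100]; empty for no data or invalid p. */
    @Override
    public OptionalDouble percentile(double p) {
      if (values.length == 0 || p <= 0 || p > 100) {
        return OptionalDouble.empty();
      }
      int rank = (int) Math.ceil(p / 100.0 * values.length);
      return OptionalDouble.of(values[rank - 1]);
    }
  }

  public static void main(String[] args) {
    Result oldStyle = new BasicResult(5, 1, 4, 2, 3);
    Result newStyle = new PercentileResult(5, 1, 4, 2, 3);
    // Callers written against the old contract are unaffected.
    System.out.println(oldStyle.sum() + " " + oldStyle.percentile(50).isPresent());
    // Callers that know about percentiles can ask, and must handle "empty".
    System.out.println(newStyle.sum() + " " + newStyle.percentile(50));
  }
}
```

The trade-off in this sketch is that consumers must handle the "empty" case
for runners that do not track percentiles, which is the compatibility concern
raised elsewhere in this thread.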


Kenn

On Wed, Sep 15, 2021 at 10:37 AM Ajo Thomas <ajo.thoma...@gmail.com> wrote:

> Thanks for the response, Alexey and Ke.
> I agree with your point about introducing a new metric type (say,
> Percentiles) instead of altering the Distribution metric type, to ensure
> compatibility across runners and SDKs.
> I am currently working on a prototype that adds this new metric type to the
> metrics API and am testing it with the Samza runner. I can share a design
> doc with possible solutions with the community very soon.
>
> Thanks
> Ajo
>
> On Wed, Sep 15, 2021 at 9:26 AM Alexey Romanenko <aromanenko....@gmail.com>
> wrote:
>
>> I agree with Ke Wu that we need to keep compatibility across all runners
>> for the same metrics. So it seems that it would be better to create
>> another metric type in this case.
>>
>> Also, to discuss it in detail, I’d recommend creating a design document
>> with possible solutions and examples.
>>
>> —
>> Alexey
>>
>> On 14 Sep 2021, at 19:04, Ke Wu <ke.wu...@gmail.com> wrote:
>>
>> I prefer adding a new metric type instead of enhancing the existing
>> Distribution [1] to support percentiles, etc., in order to ensure better
>> compatibility.
>>
>> @Luke @Kyle what are your thoughts on this?
>>
>> Best,
>> Ke
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/Distribution.java
>>
>>
>> On Sep 7, 2021, at 1:28 PM, Ajo Thomas <ajo.thoma...@gmail.com> wrote:
>>
>> Hi All,
>>
>> I am working on adding support for some additional distribution metrics,
>> such as standard deviation and percentiles, to the Metrics API. The runner
>> of interest here is the Samza runner. I wanted to get the opinion of fellow
>> Beam devs on this.
>>
>> One way to do this would be to make changes to the existing Distribution
>> metric:
>> - Add additional metrics to the Distribution metric: custom percentiles,
>> standard deviation, mean. Use a Dropwizard Histogram under the hood in
>> DistributionData to track the distribution of the data.
>> - This also means changes to accompanying classes like DistributionData
>> and DistributionResult, which might involve runner-specific changes.
>>
>> Is this an acceptable change, or would you suggest something else? Is the
>> Distribution metric only intended to track the metrics that it currently
>> tracks: sum, min, max, count?
>>
>> Thanks
>> Ajo
>>
>>
>>
>>
