If I recall correctly from when the metrics were introduced (http://s.apache.org/beam-metrics-api), the intention of the Distribution metric was to leave the representation flexible. The name was chosen to be abstract, so that a runner could track the data in its own way. Specifically, Distribution was supposed to serve the same purpose as the Dropwizard Histogram. Is it possible to extend it in a backwards-compatible way, or not?
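
To make the question concrete, here is a rough, self-contained sketch of what a backwards-compatible extension could look like. Everything in it is hypothetical: BasicDistributionResult and PercentileTrackingDistribution are names made up for illustration, not Beam or Dropwizard classes. It also keeps every value in memory and uses nearest-rank percentiles, where a real implementation would more likely sample into a bounded reservoir the way Dropwizard's Histogram does. The only point is that the existing sum/count/min/max accessors stay untouched and the percentile accessor is purely additive.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the four values Distribution reports today. */
interface BasicDistributionResult {
  long getSum();
  long getCount();
  long getMin();
  long getMax();
}

/**
 * Hypothetical extension: same four accessors, plus a percentile query.
 * Assumption: all values are retained in memory, which is only reasonable
 * for a sketch; a production version would bound memory use.
 */
final class PercentileTrackingDistribution implements BasicDistributionResult {
  private long[] values = new long[16];
  private int size = 0;
  private long sum = 0;
  private long min = Long.MAX_VALUE;
  private long max = Long.MIN_VALUE;

  /** Records one observation and updates the running aggregates. */
  void update(long value) {
    if (size == values.length) {
      values = Arrays.copyOf(values, size * 2);
    }
    values[size++] = value;
    sum += value;
    min = Math.min(min, value);
    max = Math.max(max, value);
  }

  @Override public long getSum() { return sum; }
  @Override public long getCount() { return size; }
  @Override public long getMin() { return min; }
  @Override public long getMax() { return max; }

  /** Nearest-rank percentile for p in (0, 100]. */
  long getPercentile(double p) {
    if (size == 0) {
      throw new IllegalStateException("no values recorded");
    }
    if (p <= 0 || p > 100) {
      throw new IllegalArgumentException("p must be in (0, 100]");
    }
    long[] sorted = Arrays.copyOf(values, size);
    Arrays.sort(sorted);
    // Nearest rank: the smallest rank r such that r / size >= p / 100.
    int rank = (int) Math.ceil(p * size / 100.0);
    return sorted[rank - 1];
  }
}
```

Feeding it the values 1 through 10, getPercentile(50) gives 5 and getPercentile(90) gives 9 under the nearest-rank definition, while sum, count, min and max report 55, 10, 1 and 10 exactly as before.
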

Kenn

On Wed, Sep 15, 2021 at 10:37 AM Ajo Thomas <ajo.thoma...@gmail.com> wrote:

> Thanks for the response, Alexey and Ke.
> Agree with your point to introduce a new metric type (say Percentiles)
> instead of altering the Distribution metric type to ensure compatibility
> across runners and sdks.
> I am currently working on a prototype to add this new metric type to the
> metrics API and testing it with samza runner. I can share a design doc with
> the community with possible solutions very soon.
>
> Thanks
> Ajo
>
> On Wed, Sep 15, 2021 at 9:26 AM Alexey Romanenko <aromanenko....@gmail.com>
> wrote:
>
>> I agree with Ke Wu in the way that we need to keep compatibility across
>> all runners and the same metrics. So, it seems that it would be better to
>> create another metric type in this case.
>>
>> Also, to discuss it in details, I’d recommend to create a design document
>> with possible solutions and examples.
>>
>> --
>> Alexey
>>
>> On 14 Sep 2021, at 19:04, Ke Wu <ke.wu...@gmail.com> wrote:
>>
>> I prefer adding a new metrics type instead of enhancing the existing
>> Distribution [1] to support percentiles etc in order to ensure better
>> compatibility.
>>
>> @Luke @Kyle what are your thoughts on this?
>>
>> Best,
>> Ke
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/Distribution.java
>>
>> On Sep 7, 2021, at 1:28 PM, Ajo Thomas <ajo.thoma...@gmail.com> wrote:
>>
>> Hi All,
>>
>> I am working on adding support for some additional distribution metrics
>> like std dev, percentiles to the Metrics API. The runner of interest here
>> is Samza runner. I wanted to get the opinion of fellow beam devs on this.
>>
>> One way to do this would be to make changes to the existing Distribution
>> metric:
>> - Add additional metrics to Distribution metric- custom percentiles, std
>> dev, mean. Use Dropwizard Histogram under the hood in DistributionData to
>> track the distribution of the data.
>> - This also means changes to accompanying classes like DistributionData,
>> DistributionResult which might involve runner specific changes.
>>
>> Is this an acceptable change or would you suggest something else? Is the
>> Distribution metric only intended to track the metrics that it is currently
>> tracking- sum, min, max, count?
>>
>> Thanks
>> Ajo