Thanks for the response, Alexey and Ke. I agree with your point: introducing a new metric type (say, Percentiles) instead of altering the existing Distribution metric type will ensure compatibility across runners and SDKs. I am currently working on a prototype that adds this new metric type to the Metrics API, and I am testing it with the Samza runner. I can share a design doc with possible solutions with the community very soon.
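As a rough illustration of the kind of metric being discussed, a percentile-tracking type could be sketched in plain Java along the lines below. This is purely hypothetical: `PercentilesSketch` and its methods are made-up names, not part of the Beam Metrics API, and the prototype may look quite different. The sketch keeps every value in memory and uses the nearest-rank method; a real implementation would more likely rely on a bounded structure such as the Dropwizard Histogram mentioned in the original proposal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a percentile metric; NOT part of the Beam Metrics API. */
public class PercentilesSketch {
    // Assumption: all observed values are kept in memory (fine for a demo only).
    private final List<Long> values = new ArrayList<>();

    /** Records one observed value. */
    public synchronized void update(long value) {
        values.add(value);
    }

    /** Returns the nearest-rank percentile for p in (0, 100]. */
    public synchronized long percentile(double p) {
        if (values.isEmpty()) {
            throw new IllegalStateException("no values recorded");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        List<Long> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        // Nearest rank: the smallest rank r such that r / n >= p / 100.
        int rank = (int) Math.ceil(p * sorted.size() / 100.0);
        return sorted.get(rank - 1);
    }

    public static void main(String[] args) {
        PercentilesSketch latency = new PercentilesSketch();
        for (long v = 1; v <= 100; v++) {
            latency.update(v);
        }
        System.out.println("p50 = " + latency.percentile(50)); // p50 = 50
        System.out.println("p99 = " + latency.percentile(99)); // p99 = 99
    }
}
```

Keeping this as a separate type leaves the existing Distribution results (sum, min, max, count) untouched, which is the compatibility concern raised in this thread.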
Thanks
Ajo

On Wed, Sep 15, 2021 at 9:26 AM Alexey Romanenko <aromanenko....@gmail.com> wrote:

> I agree with Ke Wu in the way that we need to keep compatibility across
> all runners and the same metrics. So, it seems that it would be better to
> create another metric type in this case.
>
> Also, to discuss it in details, I’d recommend to create a design document
> with possible solutions and examples.
>
> --
> Alexey
>
> On 14 Sep 2021, at 19:04, Ke Wu <ke.wu...@gmail.com> wrote:
>
> I prefer adding a new metrics type instead of enhancing the existing
> Distribution [1] to support percentiles etc in order to ensure better
> compatibility.
>
> @Luke @Kyle what are your thoughts on this?
>
> Best,
> Ke
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/Distribution.java
>
> On Sep 7, 2021, at 1:28 PM, Ajo Thomas <ajo.thoma...@gmail.com> wrote:
>
> Hi All,
>
> I am working on adding support for some additional distribution metrics
> like std dev, percentiles to the Metrics API. The runner of interest here
> is Samza runner. I wanted to get the opinion of fellow beam devs on this.
>
> One way to do this would be to make changes to the existing Distribution
> metric:
> - Add additional metrics to Distribution metric- custom percentiles, std
>   dev, mean. Use Dropwizard Histogram under the hood in DistributionData to
>   track the distribution of the data.
> - This also means changes to accompanying classes like DistributionData,
>   DistributionResult which might involve runner specific changes.
>
> Is this an acceptable change or would you suggest something else? Is the
> Distribution metric only intended to track the metrics that it is currently
> tracking- sum, min, max, count?
>
> Thanks
> Ajo