Yes, I agree with you, and sorry for mixing the two topics together in this discussion. I was just wondering whether anyone plans to support Meters/Histograms in the near future. If so, we might need to modify metrics a bit in the Beam SDK, IMHO; that's the reason I started this discussion.
On Fri, Jun 23, 2017 at 3:21 PM, Jean-Baptiste Onofré <[email protected]> wrote:

> Hi Codi,
>
> I think there are two "big" topics around metrics:
>
> - what we collect
> - where we send the collected data
>
> The "generic metric sink" (BEAM-2456) is for the later: we don't really
> change/touch the collected data (or maybe just in case of data format) we
> send to the sink.
>
> The Meters/Histograms is both more the collected data IMHO.
>
> Regards
> JB
>
> On 06/23/2017 04:09 AM, Cody Innowhere wrote:
>
>> Hi JB,
>> Glad to hear that.
>> Still, I'm thinking about adding support of Meters & Histograms(maybe
>> extending Distribution). As the discussion mentions, problem is that
>> Meter/Histogram cannot be updated directly in current way because their
>> internal data decays after time. Do you plan to refactor current
>> implementation so that they can be supported while working on the
>> generic metric sink?
>>
>> On Thu, Jun 22, 2017 at 9:37 PM, Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>>> Hi
>>>
>>> Agree with Aviem and yes actually I'm working on a generic metric sink. I
>>> created a Jira about that. I'm off today, I will send some details asap.
>>>
>>> Regards
>>> JB
>>>
>>> On Jun 22, 2017, 15:16, at 15:16, Aviem Zur <[email protected]> wrote:
>>>
>>>> Hi Cody,
>>>>
>>>> Some of the runners have their own metrics sink, for example Spark
>>>> runner uses Spark's metrics sink which you can configure to send the
>>>> metrics to backends such as Graphite.
>>>>
>>>> There have been ideas floating around for a Beam metrics sink extension
>>>> which will allow users to send Beam metrics to various metrics
>>>> backends, I believe @JB is working on something along these lines.
>>>>
>>>> On Thu, Jun 22, 2017 at 2:00 PM Cody Innowhere <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi guys,
>>>>> Currently metrics are implemented in runners/core as CounterCell,
>>>>> GaugeCell, DistributionCell, etc. If we want to send metrics to
>>>>> external systems via metrics reporter, we would have to define another
>>>>> set of metrics, say, codahale metrics, and update codahale metrics
>>>>> periodically with beam sdk metrics, which is inconvenient and
>>>>> inefficient.
>>>>>
>>>>> Another problem is that Meter/Histogram cannot be updated directly in
>>>>> this way because their internal data decays after time.
>>>>>
>>>>> My opinion would be bridge beam sdk metrics to underlying runners so
>>>>> that updates would directly apply to underlying runners (Flink, Spark,
>>>>> etc) without conversion.
>>>>>
>>>>> Specifically, currently we already delegate
>>>>> Metrics.counter/gauge/distribution to
>>>>> DelegatingCounter/Gauge/Distribution, which uses MetricsContainer to
>>>>> store the actual metrics with the implementation of
>>>>> MetricsContainerImpl. If we can add an API in MetricsEnvironment to
>>>>> allow runners to override the default implementation, say, for flink,
>>>>> we have FlinkMetricsContainerImpl, then all metric updates will
>>>>> directly apply to metrics in FlinkMetricsContainerImpl without
>>>>> intermediate conversion and updates. And since the metrics are
>>>>> runner-specific, it would be a lot easier to support metrics reporters
>>>>> as well as Meters/Histograms.
>>>>>
>>>>> What do you think?
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
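
To make the proposal from the original message a bit more concrete (letting a runner override the default metrics container so that updates land directly in the runner's own metrics), here is a minimal, self-contained Java sketch. Every name in it (Counter, MetricsContainer, DefaultContainer, RunnerRegistry, RunnerContainer, Environment.setContainerFactory) is a hypothetical stand-in invented for illustration; none of it is Beam's actual API, and the real MetricsEnvironment/MetricsContainerImpl may well look different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class PluggableMetricsSketch {

    /** Hypothetical counter handle; stands in for a metric cell. */
    interface Counter {
        void inc(long n);
        long value();
    }

    /** Hypothetical container interface that a runner could implement. */
    interface MetricsContainer {
        Counter getCounter(String name);
    }

    /** Default container: keeps counters in a plain in-memory map. */
    static class DefaultContainer implements MetricsContainer {
        private final Map<String, long[]> cells = new HashMap<>();

        @Override
        public Counter getCounter(String name) {
            long[] cell = cells.computeIfAbsent(name, k -> new long[1]);
            return new Counter() {
                @Override public void inc(long n) { cell[0] += n; }
                @Override public long value() { return cell[0]; }
            };
        }
    }

    /** Stand-in for a runner's native metric registry (assumption, not a real runner class). */
    static class RunnerRegistry {
        final Map<String, Long> nativeCounters = new HashMap<>();
    }

    /** Runner-specific container: every update goes straight to the native registry. */
    static class RunnerContainer implements MetricsContainer {
        private final RunnerRegistry registry;

        RunnerContainer(RunnerRegistry registry) { this.registry = registry; }

        @Override
        public Counter getCounter(String name) {
            return new Counter() {
                @Override public void inc(long n) {
                    registry.nativeCounters.merge(name, n, Long::sum);
                }
                @Override public long value() {
                    return registry.nativeCounters.getOrDefault(name, 0L);
                }
            };
        }
    }

    /** Hypothetical hook that lets a runner replace the default container. */
    static class Environment {
        private static Supplier<MetricsContainer> factory = DefaultContainer::new;

        static void setContainerFactory(Supplier<MetricsContainer> f) { factory = f; }
        static MetricsContainer newContainer() { return factory.get(); }
    }

    public static void main(String[] args) {
        RunnerRegistry registry = new RunnerRegistry();
        // The runner installs its own container once, at startup.
        Environment.setContainerFactory(() -> new RunnerContainer(registry));

        // User code only sees the generic interfaces.
        MetricsContainer container = Environment.newContainer();
        container.getCounter("elements").inc(3);
        container.getCounter("elements").inc(2);

        // The update is already visible in the runner's registry; no periodic copy step.
        System.out.println(registry.nativeCounters.get("elements"));
    }
}
```

The point of the sketch is only the indirection: user code talks to a generic container, and the runner decides what sits behind it. A time-decaying metric such as a Meter or Histogram would then be updated on every event by the runner's own implementation rather than reconstructed from periodic snapshots.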
