On Wed, Nov 13, 2019 at 10:56 AM Maximilian Michels <m...@apache.org> wrote:
>
> > Are you referring specifically to?
> > * beam:metric:element_count:v1
> > * beam:metric:pardo_execution_time:start_bundle_msecs:v1
> > * beam:metric:pardo_execution_time:process_bundle_msecs:v1
> > * beam:metric:pardo_execution_time:finish_bundle_msecs:v1
> > * beam:metric:ptransform_execution_time:total_msecs:v1
>
> Yes.
>
> > Would the gauge be grouped per element or per bundle?
>
> Per bundle. These are reported when the bundle finishes.
>
> > If grouped at the bundle level the metrics are arbitrary to the user since
> > the bundle size is chosen by the runner.
>
> Not necessarily because the bundle size is typically fixed (at least in
> the Flink Runner). In any case, it provides information about how much
> activity occurred in a bundle which is useful to know.
>
> > There is also a very significant overhead for tracking low level metrics
>
> I can't imagine tracking a per-bundle element count or execution time is
> that expensive. Maybe I'm wrong.

These are element counts and execution time per operation (e.g. per
DoFn). FWIW, process_bundle_msecs is misnamed; it should be
"process_element" or just "process", as it refers to the time spent in
that method.

beam:metric:ptransform_execution_time:total_msecs:v1 seems redundant
with the sum of the others. (Unless it includes setup/teardown, which
seem to be missing as separate values?)

I think what you want is new metrics associated with the bundle +
executable stage as a whole. Distribution metrics would make the most
sense here. (Gauge metrics would just report the value of whatever
bundle finished last...) I don't know how they'd be named; perhaps
they'd be labeled with the full set of transforms that the stage
contains (which is of course not stable)?

> On 13.11.19 18:58, Luke Cwik wrote:
> > Are you referring specifically to?
> > * beam:metric:element_count:v1
> > * beam:metric:pardo_execution_time:start_bundle_msecs:v1
> > * beam:metric:pardo_execution_time:process_bundle_msecs:v1
> > * beam:metric:pardo_execution_time:finish_bundle_msecs:v1
> > * beam:metric:ptransform_execution_time:total_msecs:v1
> >
> > Would the gauge be grouped per element or per bundle?
> > If grouped at the bundle level the metrics are arbitrary to the user
> > since the bundle size is chosen by the runner.
> > If grouped at the element level then only a few of the metrics make sense:
> > * element_count becomes number of outputs per input element
> > * process_bundle_msecs becomes amount of time to process a single input
> >   element (does this still apply to elements that can be split?)
> >
> > There is also a very significant overhead for tracking low level metrics
> > in great detail which is why timing is done through a sampling
> > technique. I'm sure if we could do it cheaply then it would make sense
> > to get those metrics. This is also a place where we want each SDK to
> > implement these metrics so complexity may slow down SDK authors from
> > developing them.
> >
> >
> > On Wed, Nov 13, 2019 at 5:13 AM Maximilian Michels <m...@apache.org> wrote:
> >
> >     Hi,
> >
> >     We have a series of builtin PTransform/PCollection metrics:
> >     https://github.com/apache/beam/blob/808cb35018cd228a59b152234b655948da2455fa/model/pipeline/src/main/proto/metrics.proto#L74
> >
> >     Why are those of type counter ("beam:metrics:sum_int_64")? I think
> >     the better default type for most users would be gauge
> >     ("beam:metrics:latest_int_64").
> >
> >     I understand that counters are useful because they retain the sum of
> >     all reported values, but for getting an idea about the deviation of a
> >     metric, gauges could be more useful.
> >
> >     Perhaps we could make this configurable?
> >
> >     Thanks,
> >     Max
> >
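
For reference, here is a rough sketch of how the three aggregation styles discussed in this thread would treat the same per-bundle reports. It is plain Python, not Beam SDK code: the class names, the `aggregate` helper, and the bundle sizes are all made up for illustration, and the distribution fields (count, sum, min, max) are an assumption about what such a metric would typically track.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class SumInt64:
    """Counter-style aggregation: keeps the running sum of reported values."""
    value: int = 0

    def update(self, v: int) -> None:
        self.value += v


@dataclass
class LatestInt64:
    """Gauge-style aggregation: keeps only the most recently reported value."""
    value: Optional[int] = None

    def update(self, v: int) -> None:
        self.value = v


@dataclass
class DistributionInt64:
    """Distribution-style aggregation (assumed fields: count, sum, min, max)."""
    count: int = 0
    total: int = 0
    minimum: Optional[int] = None
    maximum: Optional[int] = None

    def update(self, v: int) -> None:
        self.count += 1
        self.total += v
        self.minimum = v if self.minimum is None else min(self.minimum, v)
        self.maximum = v if self.maximum is None else max(self.maximum, v)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


def aggregate(
    per_bundle_values: Iterable[int],
) -> Tuple[SumInt64, LatestInt64, DistributionInt64]:
    """Feed one value per finished bundle into all three aggregations."""
    counter, gauge, dist = SumInt64(), LatestInt64(), DistributionInt64()
    for v in per_bundle_values:
        counter.update(v)
        gauge.update(v)
        dist.update(v)
    return counter, gauge, dist


# Hypothetical element counts reported by four bundles, in completion order.
counter, gauge, dist = aggregate([100, 100, 100, 7])
print(counter.value)                          # 307
print(gauge.value)                            # 7
print(dist.minimum, dist.maximum, dist.mean)  # 7 100 76.75
```

With these made-up numbers the gauge only shows the small final bundle, the counter hides per-bundle variation entirely, and the distribution retains enough to see the spread, which is the trade-off raised above.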