[ https://issues.apache.org/jira/browse/BEAM-3926?focusedWorklogId=106880&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106880 ]
ASF GitHub Bot logged work on BEAM-3926:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/May/18 22:10
            Start Date: 29/May/18 22:10
    Worklog Time Spent: 10m
      Work Description: ajamato commented on a change in pull request #5437: [BEAM-3926] Add new metrics protos based on "Defining and adding SDK Metrics" htt…
URL: https://github.com/apache/beam/pull/5437#discussion_r191590046

##########
File path: model/fn-execution/src/main/proto/beam_fn_api.proto
##########
@@ -257,6 +262,122 @@ message ProcessBundleProgressRequest {
   string instruction_reference = 1;
 }

+message MonitoringInfo {
+  // The name defining the metric or monitored state.
+  string urn = 1;
+
+  // This is specified as a URN that implies:
+  // A message class: (Distribution, Counter, Extrema, MonitoringDataTable).
+  // Sub types like field formats - int64, double, string.
+  // Aggregation methods - SUM, LATEST, TOP-N, BOTTOM-N, DISTRIBUTION
+  // valid values are:
+  // beam:metrics:[SumInt64|LatestInt64|Top-NInt64|Bottom-NInt64|
+  // SumDouble|LatestDouble|Top-NDouble|Bottom-NDouble|DistributionInt64|
+  // DistributionDouble|MonitoringDataTable]
+  string type = 2;
+
+  // The Metric or monitored state.
+  oneof monitoring_status {
+    MonitoringTableData monitored_table_data = 3;
+    Metric metric = 4;
+  }
+
+  // A set of key+value labels which define the scope of the metric.
+  // Either a well defined entity id for the keys:
+  // “transform”, “pcollection”, “windowing_strategy”,
+  // “coder”, “environment” or any arbitrary label
+  // set by a custom metric or user metric.
+  // A monitoring system is expected to be able to aggregate the metric together
+  // for all updates having the same URN and labels.
+  // Some systems such as Stackdriver will be able to aggregate the metric
+  // using a subset of the provided labels
+  map<string, string> labels = 5;
+}
+
+message Metric {
+  // (Required) The data for this metric.
+  oneof data {
+    CounterData counter_data = 1;
+    DistributionData distribution_data = 2;
+    Extrema extrema_data = 3;
+  }
+}
+
+// Data associated with a Counter or Gauge metric.
+// This is designed to be compatible with metric collection
+// systems such as DropWizard.
+message CounterData {
+  oneof value {
+    int64 int64_value = 1;
+    string string_value = 2;
+    double double_value = 3;
+  }
+}
+
+// Extrema messages are used for calculating
+// Top-N/Bottom-N metrics.
+message Extrema {
+  // Only one of the two should be specified.
+  // Note: oneof is not allowed on repeated fields.
+  repeated int64 int_values = 1;
+  repeated double double_values = 2;
+}
+
+// Data associated with a distribution metric.
+// This is based off of the current DistributionData metric
+// This is not a stackdriver or dropwizard compatible
+// style of distribution metric.
+message DistributionData {
+  oneof distribution {
+    IntDistributionData int_double_distribution = 1;

Review comment:
Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 106880)
    Time Spent: 3h 50m  (was: 3h 40m)

> Support MetricsPusher in Dataflow Runner
> ----------------------------------------
>
>                 Key: BEAM-3926
>                 URL: https://issues.apache.org/jira/browse/BEAM-3926
>             Project: Beam
>          Issue Type: Sub-task
>          Components: runner-dataflow
>            Reporter: Scott Wegner
>            Assignee: Pablo Estrada
>            Priority: Major
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> See [relevant email thread|https://lists.apache.org/thread.html/2e87f0adcdf8d42317765f298e3e6fdba72917a72d4a12e71e67e4b5@%3Cdev.beam.apache.org%3E].
> From [~echauchot]:
>
> _AFAIK Dataflow being a cloud hosted engine, the related runner is very
> different from the others. It just submits a job to the cloud hosted engine.
> So, no access to metrics container etc... from the runner. So I think that
> the MetricsPusher (component responsible for merging metrics and pushing them
> to a sink backend) must not be instantiated in DataflowRunner otherwise it
> would be more a client (driver) piece of code and we will lose all the
> interest of being close to the execution engine (among other things
> instrumentation of the execution of the pipelines). I think that the
> MetricsPusher needs to be instantiated in the actual Dataflow engine._
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)