[ https://issues.apache.org/jira/browse/BEAM-10928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Beam JIRA Bot updated BEAM-10928:
---------------------------------
    Labels: stale-P2  (was: )

> FlinkDistributionGauge and FlinkGauge metrics are exported as zero to Prometheus when using any Flink's PrometheusReporter
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-10928
>                 URL: https://issues.apache.org/jira/browse/BEAM-10928
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.23.0
>            Reporter: Ivan San Jose
>            Priority: P2
>              Labels: stale-P2
>
> To be honest, I'm really lost on this one, so let me explain the issue.
> Beam has its own metric types (org/apache/beam/sdk/metrics/Metrics.java):
> counter, distribution, and gauge. Depending on the runner, it wraps them
> into the corresponding runner types. For example, for Flink, Beam wraps its
> Gauge type into a class called FlinkGauge, which implements Flink's
> Gauge<Long>.
> Beam's Distribution metric is likewise wrapped into a Flink
> Gauge<DistributionResult>, where DistributionResult is a Beam type containing
> min, max, sum, and count.
> If you are using Flink and export those metrics to Prometheus with
> flink-metrics-prometheus, you will see that they are always zero. If you set
> the DEBUG log level for the "org.apache.flink.metrics.prometheus" package,
> you will see errors like the following:
> {code}
> 2020-09-18 06:27:04,387 DEBUG Invalid type for Gauge org.apache.beam.runners.flink.metrics.FlinkMetricContainer$FlinkDistributionGauge@30211d3f: org.apache.beam.sdk.metrics.AutoValue_DistributionResult, only number types and booleans are supported by this reporter.
> 2020-09-18 06:27:04,394 DEBUG Invalid type for Gauge org.apache.beam.runners.flink.metrics.FlinkMetricContainer$FlinkGauge@2ad1562: org.apache.beam.sdk.metrics.AutoValue_GaugeResult, only number types and booleans are supported by this reporter.
> {code}
> This is really weird, because the source code of AbstractPrometheusReporter
> shows that it takes the metric value from Flink's Gauge using getValue():
> https://github.com/apache/flink/blob/master/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L225
> FlinkGauge.getValue() should return a long rather than an
> org.apache.beam.sdk.metrics.AutoValue_GaugeResult, so I don't understand what
> is happening there. Maybe the AutoValue mechanism is messing things up?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
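
For illustration only, here is a minimal, self-contained Java sketch of the behaviour described above. It does not use Flink or Beam classes: `Gauge`, `DistributionSnapshot`, `toReportedValue`, and `explode` are hypothetical stand-ins, and the type check is an assumption based on the log message ("only number types and booleans are supported"). The sketch shows why a gauge whose value is a composite object ends up exported as zero, and one possible way around it, which is to expose each numeric field as its own gauge.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GaugeTypeCheckSketch {

    /** Hypothetical stand-in for a generic gauge interface with a single getValue() method. */
    public interface Gauge<T> {
        T getValue();
    }

    /** Hypothetical stand-in for a composite result holding min, max, sum, and count. */
    public static final class DistributionSnapshot {
        final long min;
        final long max;
        final long sum;
        final long count;

        public DistributionSnapshot(long min, long max, long sum, long count) {
            this.min = min;
            this.max = max;
            this.sum = sum;
            this.count = count;
        }
    }

    /**
     * Assumed reporter-side check, modelled on the log message in the issue:
     * numbers and booleans are converted, anything else is reported as zero.
     */
    public static double toReportedValue(Gauge<?> gauge) {
        Object value = gauge.getValue();
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        if (value instanceof Boolean) {
            return ((Boolean) value) ? 1 : 0;
        }
        return 0; // null or unsupported type: exported as zero
    }

    /** One possible workaround: publish every field as a separate numeric gauge. */
    public static Map<String, Gauge<Long>> explode(String name, Gauge<DistributionSnapshot> source) {
        Map<String, Gauge<Long>> gauges = new LinkedHashMap<>();
        gauges.put(name + ".min", () -> source.getValue().min);
        gauges.put(name + ".max", () -> source.getValue().max);
        gauges.put(name + ".sum", () -> source.getValue().sum);
        gauges.put(name + ".count", () -> source.getValue().count);
        return gauges;
    }

    public static void main(String[] args) {
        Gauge<DistributionSnapshot> composite = () -> new DistributionSnapshot(2, 9, 30, 5);

        // The composite value is neither a Number nor a Boolean, so it collapses to zero.
        System.out.println("composite = " + toReportedValue(composite));

        // Each exploded gauge carries a Long, which passes the type check.
        for (Map.Entry<String, Gauge<Long>> entry : explode("latency", composite).entrySet()) {
            System.out.println(entry.getKey() + " = " + toReportedValue(entry.getValue()));
        }
    }
}
```

Whether splitting the metric this way is the right fix for the Flink runner is an open question. The sketch only shows that the zero values follow from the type of the object returned by `getValue()`, not from the metric data itself.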