Hi all,
I'm currently adding some metrics to an existing pipeline that runs on Google
Dataproc via Spark, and I'm trying to determine how to access those metrics
and eventually expose them to Stackdriver (for use downstream in Grafana
dashboards).
The metrics themselves are fairly simple (a series of counters). They are
defined as follows and incremented in DoFns throughout the pipeline:
```
import org.apache.beam.sdk.metrics.Counter
// Beam's Metrics factory, imported under an alias so it doesn't clash with
// the local Metrics object below
import org.apache.beam.sdk.metrics.Metrics as BeamMetrics

/** Metrics gathered during Event-related transforms */
private object Metrics {
    // This is used to keep track of any dynamically added data sources
    // and their counts
    val totalMessages: Counter =
        BeamMetrics.counter(Events::class.qualifiedName, "messages_total")
}
```
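For context, the counters are incremented inside DoFns roughly like this
(CountMessagesFn and Event are illustrative names rather than the actual
transforms):
```
import org.apache.beam.sdk.transforms.DoFn

// Illustrative pass-through DoFn that bumps the counter for every element
// it processes; Event stands in for the pipeline's real element type.
class CountMessagesFn : DoFn<Event, Event>() {
    @ProcessElement
    fun processElement(ctx: ProcessContext) {
        Metrics.totalMessages.inc()
        ctx.output(ctx.element())
    }
}
```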
After initially running the pipeline in Dataproc, I wasn't able to see anything
indicating that the metrics were being exposed at all. I haven't added any
specific configuration to handle this within the pipeline itself; however, I
did notice an interface that I may need to implement called MetricsOptions:
```
interface MyPipelineOptions : ... MetricsOptions { ... }
```
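In case it helps frame the question, here's roughly how I'd imagine the wiring
to look if MetricsOptions does need to be configured explicitly. This is an
unverified sketch pieced together from the MetricsOptions javadoc; the endpoint
URL and push period are placeholders:
```
import org.apache.beam.sdk.options.PipelineOptionsFactory

// Unverified sketch: configure Beam's metrics pusher via MetricsOptions.
// The URL is a stand-in for whatever collector would sit in front of
// Stackdriver.
val options = PipelineOptionsFactory
    .fromArgs(*args)
    .withValidation()
    .`as`(MyPipelineOptions::class.java)

options.metricsHttpSinkUrl = "http://metrics-collector:8080" // placeholder
options.metricsPushPeriod = 60L // push interval in seconds, if I read it right
// options.metricsSink = ...    // presumably also needs a concrete MetricsSink
//                              // implementation, since the default appears to
//                              // be a no-op
```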
So my questions primarily center on:
- Will metrics be emitted automatically, or do I need to explicitly implement
the MetricsOptions interface for the pipeline?
- Does anyone have any experience with handling this (i.e. Pipeline > Metrics >
Stackdriver)? I'd imagine that since this is all self-contained within GCP
(Dataproc + Stackdriver), it wouldn't be too rough to hand that baton off. (The
one fallback I've considered is sketched below.)
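The fallback, if nothing is emitted automatically, would be to query the
counters off the PipelineResult after the run finishes and write the values to
Stackdriver myself via the Cloud Monitoring client. Another unverified sketch:
```
import org.apache.beam.sdk.metrics.MetricNameFilter
import org.apache.beam.sdk.metrics.MetricsFilter

val result = pipeline.run()
result.waitUntilFinish()

// Pull back just our counter by namespace + name.
val filter = MetricsFilter.builder()
    .addNameFilter(MetricNameFilter.named(Events::class.qualifiedName, "messages_total"))
    .build()

for (counter in result.metrics().queryMetrics(filter).counters) {
    // Sticking to attempted values, since committed metrics aren't supported
    // on every runner.
    println("${counter.name}: ${counter.attempted}")
    // ...this is where the custom metric would get written to Stackdriver
    // via the Cloud Monitoring client...
}
```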
Any advice / articles / examples would be greatly appreciated!
Thanks,
Rion