Hi all,
I came across this ticket: https://issues.apache.org/jira/browse/BEAM-2456. I
know that the metrics subject has already been discussed a lot, but I
would like to revive the discussion.
The aim of this ticket is to avoid relying on the runners to provide the
metrics, because they do not all have the same metrics capabilities.
The idea is to keep using the Beam metrics API (and not another one such
as Codahale, as was discussed some time ago) and to provide a way to
extract the metrics with a polling thread that would be forked by a
PipelineWithMetrics (so, almost invisible to the end user) and would
then push them to a sink (an HTTP REST sink, a Graphite sink, or
anything else). Nevertheless, a polling thread might not work for all
the runners, because some might not make the metrics available before
the end of the pipeline. Also, forking a thread would be a bit
unconventional, so it could be provided as a Beam SDK extension.
Another way, which avoids polling, would be to push metric values to a
sink whenever they are updated, but I don't know if that is feasible in
a runner-independent way.
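
To make the polling idea a bit more concrete, here is a rough sketch in
plain Java. It is only an illustration under my own assumptions, not a
proposed API: MetricsSink and MetricsPoller are names I made up, nothing
here exists in the Beam SDK, and the Supplier simply stands in for
whatever call would read the current metric values from the running
pipeline.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sink abstraction (assumption): an implementation could
// write to an HTTP REST endpoint, Graphite, or anything else.
interface MetricsSink {
  void write(Map<String, Long> snapshot);
}

// Hypothetical poller (assumption, not part of the Beam SDK). The Supplier
// stands in for the call that returns the pipeline's current metric values.
final class MetricsPoller implements AutoCloseable {
  private final Supplier<Map<String, Long>> source;
  private final MetricsSink sink;
  private final ScheduledExecutorService executor;

  MetricsPoller(Supplier<Map<String, Long>> source, MetricsSink sink) {
    this.source = source;
    this.sink = sink;
    // Daemon thread, so the poller never keeps the JVM alive on its own.
    this.executor = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "metrics-poller");
      t.setDaemon(true);
      return t;
    });
  }

  // Starts polling immediately, then once every periodMillis.
  void start(long periodMillis) {
    executor.scheduleAtFixedRate(this::pollOnce, 0, periodMillis, TimeUnit.MILLISECONDS);
  }

  // Reads one snapshot and pushes a copy of it to the sink. Runtime
  // exceptions are caught so a failing sink does not cancel later polls.
  synchronized void pollOnce() {
    try {
      sink.write(new LinkedHashMap<>(source.get()));
    } catch (RuntimeException e) {
      System.err.println("metrics push failed: " + e);
    }
  }

  // Stops the polling thread, then does one last push so that values that
  // only become available at the end of the pipeline still reach the sink.
  @Override
  public void close() {
    executor.shutdownNow();
    pollOnce();
  }
}
```

A wrapper such as PipelineWithMetrics could create the poller when the
pipeline starts and close it when the pipeline finishes. The final push
in close() is there for the runners that only expose metrics at the end.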
WDYT about the ideas in this ticket?
Best,
Etienne