Hi guys,
As part of the work below I need help from the Google Dataflow engine
maintainers.
AFAIK, Dataflow being a cloud-hosted engine, the related runner is very
different from the others: it only submits a job to the hosted engine, so the
runner has no access to the metrics containers, etc. So I think that the
MetricsPusher (the component responsible for merging metrics and pushing them
to a sink backend) must not be instantiated in DataflowRunner; otherwise it
would be little more than a client (driver) piece of code and we would lose
the whole benefit of being close to the execution engine (among other things,
instrumentation of the execution of the pipelines).
I think the MetricsPusher needs to be instantiated in the actual Dataflow
engine.
As a side note, a Java implementation of the MetricsPusher is available in the
PR, in runner-core.
We need a serialized (language-agnostic) version of the MetricsPusher that
SDKs will generate from the user pipelineOptions and that will be sent to
runners for instantiation.
Are you willing to instantiate it in the Dataflow engine? I think it is
important that Dataflow is wired up, in addition to Flink and Spark, for this
new feature to ramp up (adding it to the other runners) in Beam.
If you agree, can someone from Google help with that?
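To make this a bit more concrete, here is a rough, purely illustrative sketch
of what I have in mind on the engine side. All names in it (PusherConfig, Sink,
startPusher, the sink class string, the JSON payload) are hypothetical
placeholders and not the API of the PR; it only shows the idea of shipping a
small serialized configuration and running the pusher next to the engine:

import java.io.Serializable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Purely illustrative sketch: all names here are hypothetical placeholders,
// not the classes of the PR.
public class EngineSideMetricsPusherSketch {

  /** Language-agnostic description an SDK could derive from user pipelineOptions. */
  static class PusherConfig implements Serializable {
    String sinkClass = "com.example.HttpMetricsSink"; // hypothetical sink class
    long periodSeconds = 5;                           // push frequency
  }

  /** Minimal stand-in for the sink contract the pusher writes to. */
  interface Sink {
    void write(String serializedMetrics) throws Exception;
  }

  /**
   * What the hosted engine (not the client-side DataflowRunner) would do with
   * the serialized config: instantiate the configured sink and push merged
   * metrics periodically, close to where the metrics containers actually live.
   */
  static void startPusher(PusherConfig config) throws Exception {
    Sink sink =
        (Sink) Class.forName(config.sinkClass).getDeclaredConstructor().newInstance();
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(
        () -> {
          try {
            // In the real implementation this would be the merged Beam metrics.
            sink.write("{\"counters\": []}");
          } catch (Exception e) {
            e.printStackTrace();
          }
        },
        0,
        config.periodSeconds,
        TimeUnit.SECONDS);
  }
}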
Thanks 
Etienne
On Wednesday 31 January 2018 at 14:01 +0100, Etienne Chauchot wrote:
> Hi all,
> Just to let you know that I have just submitted the PR [1].
> This PR adds the MetricsPusher discussed in document [2], scenario 3.b.
> It merges and pushes Beam metrics at a configurable (via pipelineOptions)
> frequency to a configurable sink. By default the sink is a DummySink, which is
> also useful for tests. There is also an HttpMetricsSink available that pushes
> the JSON-serialized metrics results to an HTTP backend. We could imagine many
> Beam sinks in the future to write to Ganglia, Graphite, ...
> Please note that these are not IOs, because the feature needs to be
> transparent to the user.
> The pusher only supports attempted metrics for now.
> 
> This feature is hosted in the runner-core module (as discussed in the doc) so
> that it can be used by all the runners. I have wired it up with the Spark and
> Flink runners (this part was quite painful :) ).
> If the PR is merged, I hope that runner experts will wire this feature up in
> the other runners. Your help is needed here :)
> Besides, there is nothing related to portability in this PR for now.
> Best!
> Etienne
> [1] https://github.com/apache/beam/pull/4548
> [2] https://s.apache.org/runner_independent_metrics_extraction
> 
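> For illustration, usage could look roughly like the sketch below. The option
> names (MetricsSinkOptions, setMetricsSink, setMetricsPushPeriod) are simplified
> placeholders and not necessarily the exact API of the PR; only the
> PipelineOptionsFactory and Pipeline calls are the standard Beam API:
> 
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.options.PipelineOptions;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> 
> // Hypothetical options interface; the real option names in the PR may differ.
> public interface MetricsSinkOptions extends PipelineOptions {
>   String getMetricsSink();
>   void setMetricsSink(String sinkClassName);
> 
>   Long getMetricsPushPeriod();
>   void setMetricsPushPeriod(Long seconds);
> }
> 
> // Usage from a driver program:
> MetricsSinkOptions options =
>     PipelineOptionsFactory.fromArgs(args).as(MetricsSinkOptions.class);
> options.setMetricsSink("HttpMetricsSink"); // the HTTP sink from the PR
> options.setMetricsPushPeriod(5L);          // push merged metrics every 5 seconds
> Pipeline pipeline = Pipeline.create(options);
> // ... build and run the pipeline as usual; the pusher stays transparent ...
> 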
> On 11/12/2017 at 17:33, Etienne Chauchot wrote:
> > Hi all, 
> > I sketched a little doc [1] about this subject. It tries to sum up the
> > differences between the runners regarding metrics extraction and proposes
> > some possible designs for a runner-agnostic extraction of the metrics.
> > It is a two-page doc; can you please comment on it and correct it if needed?
> > Thanks
> > Etienne
> > [1]: https://s.apache.org/runner_independent_metrics_extraction
> > 
> > On 27/11/2017 at 18:17, Ben Chambers wrote:
> > > I think discussing a runner-agnostic way of configuring how metrics are
> > > extracted is a great idea -- thanks for bringing it up, Etienne!
> > > 
> > > Using a thread that polls the pipeline result relies on the program that
> > > created and submitted the pipeline continuing to run (e.g., no machine
> > > faults, network problems, etc.). For many applications, this isn't a good
> > > model (a streaming pipeline may run for weeks, a batch pipeline may be
> > > automatically run every hour, etc.).
> > > 
> > > Etienne's proposal of having something the runner pushes metrics into has
> > > the benefit of running in the same cluster as the pipeline, and thus has
> > > the same reliability properties.
> > > 
> > > As noted, it would require runners to ensure that metrics were pushed into
> > > the extractor, but from there it would allow a general configuration of how
> > > metrics are extracted from the pipeline and exposed to some external
> > > services.
> > > 
> > > Providing something that the runners could push metrics into and have them
> > > automatically exported seems like it would have several benefits:
> > >   1. It would provide a single way to configure how metrics are actually
> > > exported.
> > >   2. It would allow the runners to ensure it was reliably executed.
> > >   3. It would allow the runner to report system metrics directly (e.g., if
> > > a runner wanted to report the watermark, it could push that in directly).
> > > 
> > > -- Ben
> > > 
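> > > To illustrate the kind of hook this could be, here is a minimal sketch. The
> > > MetricsExtractor name and its methods are invented for illustration and are
> > > not an existing Beam interface; MetricQueryResults is the standard Beam
> > > class:
> > > 
> > > import org.apache.beam.sdk.metrics.MetricQueryResults;
> > > 
> > > // Invented names, for illustration only.
> > > public interface MetricsExtractor {
> > >   /** Runners push merged user metrics here; the extractor handles export. */
> > >   void push(MetricQueryResults userMetrics);
> > > 
> > >   /** Runners can also report system metrics directly, e.g. the watermark. */
> > >   void pushSystemMetric(String name, long value);
> > > }
> > > 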
> > > On Mon, Nov 27, 2017 at 9:06 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > wrote:
> > > 
> > > > Hi all,
> > > > 
> > > > Etienne forgot to mention that we started a PoC about that.
> > > > 
> > > > What I started is to wrap the Pipeline creation to include a thread that
> > > > periodically polls the metrics in the pipeline result (it's what I
> > > > proposed when I compared with Karaf Decanter some time ago).
> > > > This thread then marshals the collected metrics and sends them to a sink.
> > > > In the end, it means that the harvested metrics data will be stored in a
> > > > backend (for instance Elasticsearch).
> > > > 
> > > > The pro of this approach is that it doesn't require any change in the
> > > > core; it's up to the user to use the PipelineWithMetric wrapper.
> > > > 
> > > > The con is that the user needs to explicitly use the PipelineWithMetric
> > > > wrapper.
> > > > 
> > > > IMHO, it's good enough, as the user can decide to poll metrics for some
> > > > pipelines and not for others.
> > > > 
> > > > Regards
> > > > JB
> > > > 
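> > > > For illustration only, the core of such a polling wrapper could look
> > > > roughly like the sketch below (the Sink interface is an invented
> > > > placeholder, not the actual PoC code; the PipelineResult and metrics
> > > > calls are the standard Beam API):
> > > > 
> > > > import java.util.concurrent.Executors;
> > > > import java.util.concurrent.ScheduledExecutorService;
> > > > import java.util.concurrent.TimeUnit;
> > > > import org.apache.beam.sdk.PipelineResult;
> > > > import org.apache.beam.sdk.metrics.MetricQueryResults;
> > > > import org.apache.beam.sdk.metrics.MetricsFilter;
> > > > 
> > > > // Rough sketch of the polling idea; not the actual PoC code.
> > > > public class PollingMetricsSketch {
> > > > 
> > > >   interface Sink {
> > > >     void write(MetricQueryResults results);
> > > >   }
> > > > 
> > > >   /** Poll the pipeline result from the driver program and push to a sink. */
> > > >   static void pollAndPush(PipelineResult result, Sink sink, long periodSeconds) {
> > > >     ScheduledExecutorService scheduler =
> > > >         Executors.newSingleThreadScheduledExecutor();
> > > >     scheduler.scheduleAtFixedRate(
> > > >         () -> sink.write(
> > > >             result.metrics().queryMetrics(MetricsFilter.builder().build())),
> > > >         0, periodSeconds, TimeUnit.SECONDS);
> > > >   }
> > > > }
> > > > 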
> > > > On 11/27/2017 04:56 PM, Etienne Chauchot wrote:
> > > > > Hi all,
> > > > > 
> > > > > I came across this ticket: https://issues.apache.org/jira/browse/BEAM-2456.
> > > > > I know that the metrics subject has already been discussed a lot, but I
> > > > > would like to revive the discussion.
> > > > > 
> > > > > The aim in this ticket is to avoid relying on the runner to provide the
> > > > > metrics, because they don't all have the same capabilities regarding
> > > > > metrics. The idea in the ticket is to still use the Beam metrics API
> > > > > (and not others like Codahale, as was discussed some time ago) and to
> > > > > provide a way to extract the metrics with a polling thread that would be
> > > > > forked by a PipelineWithMetrics (so, almost invisible to the end user)
> > > > > and then to push them to a sink (such as an HTTP REST sink, a Graphite
> > > > > sink, or anything else...). Nevertheless, a polling thread might not
> > > > > work for all the runners, because some might not make the metrics
> > > > > available before the end of the pipeline. Also, forking a thread would
> > > > > be a bit unconventional, so it could be provided as a Beam SDK extension.
> > > > > 
> > > > > Another way, to avoid polling, would be to push metrics values to a sink
> > > > > when they are updated, but I don't know if that is feasible in a
> > > > > runner-independent way.
> > > > > WDYT about the ideas in this ticket?
> > > > > 
> > > > > Best,
> > > > > Etienne
> > > > --
> > > > Jean-Baptiste Onofré
> > > > jbono...@apache.org
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > > > 
> >  
>  
