Hi all,

@Luke, @Alex, I have a general question about metrics in the Fn API.
Communication between the runner harness and the SDK harness is done on a
per-bundle basis. When the runner harness sends data to the SDK harness to
execute a transform that contains metrics, does it:

1. send the metric values (for the metrics defined in the transform) alongside
   the data, and receive updated values from the SDK harness when the bundle
   has finished processing?
2. or send only the data, with the SDK harness responding with a diff value of
   the metrics so that the runner can update them on its side?

My bet is option 2, but can you confirm?
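
To make option 2 concrete, here is a minimal sketch of the flow as I understand
it. It is only an illustration: the class and method names (SdkHarness, Runner,
process_bundle, run_bundle) are hypothetical and are not part of the Fn API, the
gRPC transport is left out, and only simple counters are modelled. The SDK side
keeps metrics scoped to one bundle and returns just that bundle's delta; the
runner side does the global aggregation and is the only place where attempted
and committed values differ.

```python
from collections import Counter


class SdkHarness:
    """Hypothetical stand-in for the SDK harness (not the real Fn API)."""

    def process_bundle(self, elements):
        # Metrics are scoped to this bundle and discarded once it returns.
        bundle_metrics = Counter()
        for element in elements:
            bundle_metrics["elements_seen"] += 1
            if element < 0:
                bundle_metrics["negative_elements"] += 1
        # Only the delta for this bundle goes back to the runner.
        return dict(bundle_metrics)


class Runner:
    """Hypothetical runner side: owns the global aggregation."""

    def __init__(self, harness):
        self.harness = harness
        self.attempted = Counter()
        self.committed = Counter()

    def run_bundle(self, elements, commit=True):
        # Only data is sent out; no metric values travel with it.
        delta = self.harness.process_bundle(elements)
        self.attempted.update(delta)
        if commit:
            # Committing is a runner-only matter; the SDK side is not told.
            self.committed.update(delta)
        return delta


runner = Runner(SdkHarness())
runner.run_bundle([1, -2, 3])
runner.run_bundle([-4, 5], commit=False)  # e.g. a bundle that is not committed
print(dict(runner.attempted))
print(dict(runner.committed))
```

Whether the runner merges the deltas through accumulators or forwards them to
an aggregation service would not change the SDK side of this sketch.
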
Thanks
Etienne
On Thursday, 19 July 2018 at 15:10 +0200, Etienne Chauchot wrote:
> Thanks for the confirmations Luke.
> On Wednesday, 18 July 2018 at 07:56 -0700, Lukasz Cwik wrote:
> > On Wed, Jul 18, 2018 at 7:01 AM Etienne Chauchot <[email protected]>
> > wrote:
> > > Hi,
> > > Luke, Alex, I have some questions about portable metrics; can you confirm
> > > the points below?
> > >
> > > 1 - As it is the SDK harness that will run the code of the UDFs, if a UDF
> > > defines a metric, then the SDK harness
> > > will give updates through gRPC calls to the runner so that the runner
> > > could update metrics cells, right?
> >
> > Yes.
> > > 2 - Alex, you mentioned in the proto and design doc that there will be no
> > > aggregation of metrics. But some runners
> > > (Spark/Flink) rely on accumulators, and when they are merged, it triggers
> > > the merging of the whole chain down to the
> > > metric cells. I know that Dataflow does not do the same: it uses
> > > non-aggregated metrics and sends them to an
> > > aggregation service. Will there be a change of paradigm with portability
> > > for runners that do the merging themselves?
> >
> > There will be local aggregation of metrics scoped to a bundle; after the
> > bundle is finished processing they are
> > discarded. This will require some kind of global aggregation support from a
> > runner, whether that runner does it via
> > accumulators or via an aggregation service is up to the runner.
> > > 3 - Please confirm that the distinction between attempted and committed
> > > metrics is not the business of portable
> > > metrics. Indeed, it does not involve communication between the runner
> > > harness and the SDK harness as it is a
> > > runner only matter. I mean, when a runner commits a bundle it just
> > > updates its committed metrics and does not need
> > > to inform the SDK harness. But, of course, when the user requests
> > > committed metrics through the SDK, then the SDK
> > > harness will ask the runner harness to give them.
> > >
> > >
> > You are correct in saying that during execution, the SDK does not
> > differentiate between attempted and committed
> > metrics and only the runner does. We still lack an API definition and
> > contract for how an SDK would query for
> > metrics from a runner, but you're right in saying that an SDK could request
> > committed metrics and the runner would
> > supply them somehow.
> > > Thanks
> > > Best
> > > Etienne
> > >
> > >