Re: [portablility] metrics interrogations

Robert Burke Mon, 10 Sep 2018 08:41:30 -0700

The way I entered them into the Go SDK is #2 (SDK sends diffs per bundle)
and the Java Runner Harness appears to aggregate them correctly from there.
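
As a rough illustration of option 2 (a minimal sketch only, not code from the
Go SDK or the Java Runner Harness; every class and function name here is
hypothetical), the SDK side keeps counter deltas scoped to one bundle, returns
them as a diff when the bundle finishes, and discards them, while the runner
side adds each diff to its own totals:

```python
from collections import Counter


class BundleMetrics:
    """Hypothetical SDK-side container: counter deltas scoped to one bundle."""

    def __init__(self):
        self._deltas = Counter()

    def inc(self, name, n=1):
        # Record an increment made by user code while the bundle is processed.
        self._deltas[name] += n

    def finish(self):
        # Return the bundle's diff and discard the bundle-local state.
        diff = dict(self._deltas)
        self._deltas.clear()
        return diff


class RunnerAggregator:
    """Hypothetical runner-side aggregation of the per-bundle diffs."""

    def __init__(self):
        self.totals = Counter()

    def apply(self, diff):
        # Counter.update adds the diff's counts to the running totals.
        self.totals.update(diff)


def process_bundle(elements):
    # Stand-in for a transform that defines two counters.
    metrics = BundleMetrics()
    for e in elements:
        metrics.inc("elements")
        if e % 2 == 0:
            metrics.inc("even")
    return metrics.finish()


runner = RunnerAggregator()
for bundle in ([1, 2, 3], [4, 5]):
    runner.apply(process_bundle(bundle))
```

The runner never sends metric values to the SDK side in this sketch; it only
receives diffs and decides for itself how to aggregate them.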


On Mon, Sep 10, 2018, 2:07 AM Etienne Chauchot <[email protected]> wrote:

> Hi all,
>
> @Luke, @Alex, I have a general question related to metrics in the Fn API.
> Communication between the runner harness and the SDK harness is done on a
> bundle basis. When the runner harness sends data to the SDK harness to
> execute a transform that contains metrics, does it:
>
>    1. send metric values (for the ones defined in the transform)
>    alongside the data and receive updated values of the metrics from the
>    SDK harness when the bundle has finished processing?
>    2. or send only the data, with the SDK harness responding with a
>    diff of the metric values so that the runner can update them on its side?
>
> My bet is option 2, but can you confirm?
>
> Thanks
>
> Etienne
>
> On Thursday, 19 July 2018 at 15:10 +0200, Etienne Chauchot wrote:
>
> Thanks for the confirmations Luke.
>
> On Wednesday, 18 July 2018 at 07:56 -0700, Lukasz Cwik wrote:
>
>
>
> On Wed, Jul 18, 2018 at 7:01 AM Etienne Chauchot <[email protected]>
> wrote:
>
> Hi,
> Luke, Alex, I have some questions about portable metrics. Can you confirm
> my understanding?
>
> 1 - Since the SDK harness runs the code of the UDFs, if a UDF
> defines a metric, the SDK harness will send updates to the runner through
> gRPC calls so that the runner can update its metric cells, right?
>
>
> Yes.
>
>
>
> 2 - Alex, you mentioned in the proto and the design doc that there will be no
> aggregation of metrics. But some runners (Spark/Flink) rely on
> accumulators, and when they are merged, it triggers the merging of the whole
> chain down to the metric cells. I know that Dataflow does not do the same: it
> uses non-aggregated metrics and sends them to an aggregation service. Will
> there be a change of paradigm with portability for runners that do the
> merging themselves?
>
>
> There will be local aggregation of metrics scoped to a bundle; after the
> bundle has finished processing, they are discarded. This will require some
> kind of global aggregation support from a runner; whether that runner does
> it via accumulators or via an aggregation service is up to the runner.
>
> 3 - Please confirm that the distinction between attempted and committed
> metrics is not the business of portable metrics. Indeed, it does not
> involve communication between the runner harness and the SDK harness, as it
> is a runner-only matter. I mean, when a runner commits a bundle, it just
> updates its committed metrics and does not need to inform the SDK harness.
> But, of course, when the user requests committed metrics through the SDK,
> the SDK harness will ask the runner harness to provide them.
>
>
>
> You are correct in saying that during execution, the SDK does not
> differentiate between attempted and committed metrics and only the runner
> does. We still lack an API definition and contract for how an SDK would
> query for metrics from a runner, but you're right in saying that an SDK could
> request committed metrics and the runner would supply them somehow.
>
>
> Thanks
>
> Best
> Etienne
>
>
>
>
