The way I entered them into the Go SDK is #2 (SDK sends diffs per bundle) and the Java Runner Harness appears to aggregate them correctly from there.
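
To make option 2 concrete, here is a rough sketch of the flow, not the actual Fn API messages or the Go SDK code. Every name in it (`RunnerMetrics`, `on_bundle_finished`, `process_bundle`, and the counter names) is hypothetical and exists only for illustration. It assumes simple counter metrics: the SDK side counts from zero for each bundle and reports only that bundle's delta, and the runner side sums the deltas, tracking attempted and committed totals on its own.

```python
from collections import Counter


class RunnerMetrics:
    """Hypothetical runner-side store; not part of any Beam API."""

    def __init__(self):
        self.attempted = Counter()
        self.committed = Counter()

    def on_bundle_finished(self, deltas, committed):
        # Every finished bundle contributes its delta to the attempted totals.
        self.attempted.update(deltas)
        # Only bundles the runner commits contribute to the committed totals.
        if committed:
            self.committed.update(deltas)


def process_bundle(elements):
    """Stand-in for the SDK side: counters start from zero for each bundle."""
    deltas = Counter()
    for element in elements:
        deltas["elements_seen"] += 1
        if not element:
            deltas["empty_elements"] += 1
    # Only this bundle's delta is returned; nothing is kept across bundles.
    return dict(deltas)


runner = RunnerMetrics()
runner.on_bundle_finished(process_bundle(["a", "", "b"]), committed=True)
# A bundle that was processed but never committed (for example, one that is retried).
runner.on_bundle_finished(process_bundle(["c", ""]), committed=False)

print(dict(runner.attempted))
print(dict(runner.committed))
```

The SDK side never sees the running totals and never learns whether a bundle was committed; both stay on the runner side, which is why the runner is free to aggregate through accumulators or through an external aggregation service.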

On Mon, Sep 10, 2018, 2:07 AM Etienne Chauchot <[email protected]> wrote:

> Hi all,
>
> @Luke, @Alex I have a general question related to metrics in the Fn API,
> given that the communication between the runner harness and the SDK
> harness is done on a bundle basis. When the runner harness sends data to
> the SDK harness to execute a transform that contains metrics, does it:
>
> 1. send the metric values (for the ones defined in the transform) along
>    with the data and receive updated values of the metrics from the SDK
>    harness when the bundle has finished processing?
> 2. or does it send only the data, and the SDK harness responds with a
>    diff value of the metrics so that the runner can update them on its
>    side?
>
> My bet is option 2. But can you confirm?
>
> Thanks
>
> Etienne
>
> On Thursday, July 19, 2018 at 15:10 +0200, Etienne Chauchot wrote:
>
> Thanks for the confirmations, Luke.
>
> On Wednesday, July 18, 2018 at 07:56 -0700, Lukasz Cwik wrote:
>
> On Wed, Jul 18, 2018 at 7:01 AM Etienne Chauchot <[email protected]>
> wrote:
>
> Hi,
> Luke, Alex, I have some questions about portable metrics. Can you
> confirm the following?
>
> 1 - As it is the SDK harness that will run the code of the UDFs, if a
> UDF defines a metric, then the SDK harness will give updates through
> gRPC calls to the runner so that the runner can update its metric
> cells, right?
>
> Yes.
>
> 2 - Alex, you mentioned in the proto and the design doc that there will
> be no aggregation of metrics. But some runners (Spark/Flink) rely on
> accumulators, and when they are merged, it triggers the merging of the
> whole chain down to the metric cells. I know that Dataflow does not do
> the same: it uses non-aggregated metrics and sends them to an
> aggregation service. Will there be a change of paradigm with
> portability for runners that merge metrics themselves?
>
> There will be local aggregation of metrics scoped to a bundle; after
> the bundle is finished processing they are discarded. This will require
> some kind of global aggregation support from a runner; whether that
> runner does it via accumulators or via an aggregation service is up to
> the runner.
>
> 3 - Please confirm that the distinction between attempted and committed
> metrics is not the business of portable metrics. Indeed, it does not
> involve communication between the runner harness and the SDK harness,
> as it is a runner-only matter. I mean, when a runner commits a bundle
> it just updates its committed metrics and does not need to inform the
> SDK harness. But, of course, when the user requests committed metrics
> through the SDK, then the SDK harness will ask the runner harness to
> give them.
>
> You are correct in saying that during execution, the SDK does not
> differentiate between attempted and committed metrics and only the
> runner does. We still lack an API definition and contract for how an
> SDK would query for metrics from a runner, but you're right in saying
> that an SDK could request committed metrics and the runner would supply
> them somehow.
>
> Thanks
>
> Best
> Etienne
