On Mon, Sep 10, 2018 at 11:07 AM Etienne Chauchot <[email protected]>
wrote:
> Hi all,
>
> @Luke, @Alex I have a general question related to metrics in the Fn API:
> as the communication between runner harness and SDK harness is done on a
> bundle basis. When the runner harness sends data to the sdk harness to
> execute a transform that contains metrics, does it:
>
> 1. send metrics values (for the ones defined in the transform)
> alongside with data and receive an updated value of the metrics from the
> sdk harness when the bundle is finished processing?
> 2. or does it send only the data and the sdk harness responds with a
> diff value of the metrics so that the runner can update them in its side?
>
> My bet is option 2. But can you confirm?
>
The runner harness periodically asks for the status of a bundle to which
the runner harness may respond with a current snapshot of metrics. These
metrics are deltas in the sense that only "dirty" metrics need to be
reported (i.e. unreported metrics can be assumed to have their previous
values) but are *not* deltas with respect to values, i.e. the full value is
reported each time. As an example, suppose one were counting red and blue
marbles. The first update may be something like
{ red: 5, blue: 7}
and if two more blue ones were found, a valid update would be
{ blue: 9 }
On bundle completion, the full set of metrics is reported as part of the
same message that declares the bundle complete.
On Tue, Sep 11, 2018 at 11:43 AM Etienne Chauchot <[email protected]>
wrote:
> Le lundi 10 septembre 2018 à 09:42 -0700, Lukasz Cwik a écrit :
>
> Alex is out on vacation for the next 3 weeks.
>
> Alex had proposed the types of metrics[1] but not the exact protocol as to
> what the SDK and runner do. I could envision Alex proposing that the SDK
> harness only sends diffs or dirty metrics in intermediate updates and all
> metrics values in the final update.
> Robert is referring to an integration that happened to an older set of
> messages[2] that preceeded Alex's proposal and that integration with
> Dataflow which is still incomplete works as you described in #2.
>
>
> Thanks Luke and Robert for the confirmation.
>
>
> Robin had recently been considering adding an accessor to DoFns that would
> allow you to get access to the job information from within the pipeline
> (current state, poll for metrics, invoke actions like cancel / drain, ...).
> He wanted it so he could poll for attempted metrics to be able to test
> @RequiresStableInput.
>
> Yes, I remember, I voted +1 to his proposal.
>
> Integrating the MetricsPusher or something like that on the SDK side to be
> able to poll metrics over the job information accessor could be useful.
>
>
> Well, in the design discussion, we decided to host Metrics Pusher as close
> as possible of the actual engine (inside the runner code chosen over the
> sdk code) to allow the runner to send system metrics in the future.
>
+1. The runner harness can then do whatever it wants (e.g. reporting back
to its master, or pushing to another service, or simply dropping them), but
the SDKs only have to follow the FnAPI contract.
>
> 1: https://s.apache.org/beam-fn-api-metrics
> 2:
> https://github.com/apache/beam/blob/9b68f926628d727e917b6a33ccdafcfe693eef6a/model/fn-execution/src/main/proto/beam_fn_api.proto#L410
>
>
> Besides, in his PR Alex talks about deprecated metrics. As he is off, can
> you tell me a little more about them ? What metrics will be deprecated when
> the portability framework is 100% operational on all the runners?
>
Currently, the SDKs return metrics to the FnAPI via the proto found at
https://github.com/apache/beam/blob/release-2.6.0/model/fn-execution/src/main/proto/beam_fn_api.proto#L410
(and specifically user metrics at
https://github.com/apache/beam/blob/release-2.6.0/model/fn-execution/src/main/proto/beam_fn_api.proto#L483
) The new metrics are the nested one-ofs defined at
https://github.com/apache/beam/blob/release-2.6.0/model/fn-execution/src/main/proto/beam_fn_api.proto#L269
>