Hi Robert + beam dev list,

I was thinking about your feedback in PR#6205
<https://github.com/apache/beam/pull/6205>, and agree that this
monitoring_infos.py
<monitoring_infos.pyhttps://github.com/apache/beam/blob/61a9f7193f1a61869915da3b4f386b34eac63822/sdks/python/apache_beam/metrics/monitoring_infos.py>
became a bit big.

I'm working on the Java Implementation of this now, and want to incorporate
some of these ideas and improve on this.

I that that I should make something like a MonitoringInfoBuilder class.
With a few methods

   - setUrn
   - setTimestamp
   - setting the value (One method for each Type we support
   
<https://github.com/apache/beam/blob/5e603ad4c642cfba0a6db70abd05ed8e9d89c7d6/model/fn-execution/src/main/proto/beam_fn_api.proto#L284>.
   Setting this will also set the type string)
      - setInt64CounterValue
      - setDoubleCounterValue
      - setLatestInt64
      - setTopNInt64
      - setMonitoringDataTable
      - setDistributionInt64
      - ...
   - setting labels (will set the key and value properly for the label)
      - setPTransform(value)
      - setPcollection(value)
      - ...


I think this will make building a metric much easier, you would just call
the 4 methods and the .build(). These builders are common in Java. (I guess
there is a similar thing way we could do in python? I'd like to go back and
refactor that as well when I am done)

-------------

As for your other suggestion to define metrics in the proto/enum file
instead of the yaml file. I am not too sure about the best strategy for
this. My initial thoughts are:

   1. Make a proto extension allowing you to describe/define a
   MonitoringInfo's (the same info as the metric_definitions.yaml
   
<https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/metric_definitions.yaml>
   file):
      1. URN
      2. Type
      3. Labels required
      4. Annotations: Description, Units, etc.
   2. Make the builder read in that MonitoringInfo definision/description
   assert everything is set properly? I think this would be a decent data
   driven approach.

I was wondering if you had something else in mind?

Thanks
Alex

Reply via email to