I see and don't know how to help you beyond what your already suggesting. >From what I remember, maps were added as syntactic sugar of lists of key value pairs.
On Tue, Oct 30, 2018 at 5:37 PM Alex Amato <ajam...@google.com> wrote: > I am not sure on the correct syntax to populate the instances of my > MonitoringInfoSpec messages > > message MonitoringInfoSpec { > > string urn = 1; > > string type_urn = 2; > > repeated string required_labels = 3; > > * map<string, string> annotations = 4;* > > } > > > Notice how the annotations field is not used anywhere. I was unable to get > this to compile and could find no examples of this on the proto github. > Perhaps I'll have to reach out to them. I was wondering if anyone here was > familiar first. > > > message MonitoringInfoSpecs { > > enum MonitoringInfoSpecsEnum { > > USER_COUNTER = 0 [(monitoring_info_spec) = { > > urn: "beam:metric:user", > > type_urn: "beam:metrics:sum_int_64", > > }]; > > > ELEMENT_COUNT = 1 [(monitoring_info_spec) = { > > urn: "beam:metric:element_count:v1", > > type_urn: "beam:metrics:sum_int_64", > > required_labels: ["PTRANSFORM"], > > }]; > > > START_BUNDLE_MSECS = 2 [(monitoring_info_spec) = { > > urn: "beam:metric:pardo_execution_time:start_bundle_msecs:v1", > > type_urn: "beam:metrics:sum_int_64", > > required_labels: ["PTRANSFORM"], > > }]; > > > PROCESS_BUNDLE_MSECS = 3 [(monitoring_info_spec) = { > > urn: "beam:metric:pardo_execution_time:process_bundle_msecs:v1", > > type_urn: "beam:metrics:sum_int_64", > > required_labels: ["PTRANSFORM"], > > }]; > > > FINISH_BUNDLE_MSECS = 4 [(monitoring_info_spec) = { > > urn: "beam:metric:pardo_execution_time:finish_bundle_msecs:v1", > > type_urn: "beam:metrics:sum_int_64", > > required_labels: ["PTRANSFORM"], > > }]; > > > TOTAL_MSECS = 5 [(monitoring_info_spec) = { > > urn: "beam:metric:ptransform_execution_time:total_msecs:v1", > > type_urn: "beam:metrics:sum_int_64", > > required_labels: ["PTRANSFORM"], > > }]; > > } > > } > > > > > On Tue, Oct 30, 2018 at 2:01 PM Lukasz Cwik <lc...@google.com> wrote: > >> I'm not sure what you mean by "Using a map in an option." >> >> For your second issue, the google docs around this show[1]: >> >> Note that if you want to use a custom option in a package other than the >> one in which it was defined, you must prefix the option name with the >> package name, just as you would for type names. For example: >> >> // foo.proto >> import "google/protobuf/descriptor.proto"; >> package foo; >> extend google.protobuf.MessageOptions { >> optional string my_option = 51234; >> } >> >> // bar.proto >> import "foo.proto"; >> package bar; >> message MyMessage { >> option (foo.my_option) = "Hello world!"; >> } >> >> >> 1: >> https://developers.google.com/protocol-buffers/docs/proto#customoptions >> >> >> On Mon, Oct 29, 2018 at 5:19 PM Alex Amato <ajam...@google.com> wrote: >> >>> Hi Robert and community, :) >>> >>> I was starting to code this up, but I wasn't sure exactly how to do some >>> of the proto syntax. Would you mind taking a look at this doc >>> <https://docs.google.com/document/d/1SB59MMVZXO0Aa6w0gf4m0qM4oYt4SiofDq3QxnpQaK4/edit?usp=sharing> >>> and let me know if you know how to resolve any of these issues: >>> >>> - Using a map in an option. >>> - Referring to string "constants" from other enum options in .proto >>> files. >>> >>> Please see the comments I have listed in the doc >>> <https://docs.google.com/document/d/1SB59MMVZXO0Aa6w0gf4m0qM4oYt4SiofDq3QxnpQaK4/edit?usp=sharing>, >>> and a few alternative suggestions. >>> Thanks >>> >>> On Wed, Oct 24, 2018 at 10:08 AM Alex Amato <ajam...@google.com> wrote: >>> >>>> Okay. That makes sense. Using runtime validation and protos is what I >>>> was thinking as well. >>>> I'll include you as a reviewer in my PRs. >>>> >>>> As for the choice of a builder/constructor/factory. If we go with >>>> factory methods/constructor then we will need a method for each metric type >>>> (intCounter, latestInt64, ...). Plus, then I don't think we can have any >>>> constructor parameters for labels, we will just need to accept a dictionary >>>> for label keys+values like you say. This is because we cannot offer a >>>> method for each URN and its combination of labels, and we should avoid such >>>> static detection, as you say. >>>> >>>> The builder however, allows us to add a method for setting each label. >>>> Since there are a small number of labels I think this should be fine. A >>>> specific metric URN will have a specific set of labels which we can >>>> validate being set. Any variant of this should use a different label (or a >>>> new version in the label). Thus, the builder can give an advantage, making >>>> it easier to set label values without needing to provide the correct key >>>> for the label, when its set. You just need to call the correct method. >>>> >>>> Perhaps it might be best to leave this open to each SDK based on its >>>> language style (Builder, Factory, etc.) , but make sure we use the >>>> protos+runtime validation. >>>> >>>> On Wed, Oct 24, 2018 at 7:02 AM Robert Bradshaw <rober...@google.com> >>>> wrote: >>>> >>>>> Thanks for bringing this to the list; it's a good question. >>>>> >>>>> I think the difficulty comes from trying to statically define a lists >>>>> of possibilities that should instead be runtime values. E.g. we >>>>> currently we're up to about a dozen distinct types, and having a >>>>> setter for each is both verbose and not very extensible (especially >>>>> replicated cross language). I'm not sure the set of possible labels is >>>>> fixed either. Generally in the FnAPI we've been using shared constants >>>>> for this kind of thing instead. (I was wary about the protos for the >>>>> same reasons; it would be good to avoid leaking this through to each >>>>> of the various SDKs.) The amount of static typing/validation one gets >>>>> depends on how much logic you build into each of these methods (e.g. >>>>> does it (almost?) always "metric" which is assumed to already be >>>>> encoded correctly, or a specific type that has tradeoffs with the >>>>> amount you can do generically (e.g. we have code that first loops over >>>>> counters, then over distributions, then over gauges, and I don't think >>>>> we want continue that pattern all M places for all N types)). >>>>> >>>>> I would probably lean towards mostly doing runtime validation here. >>>>> Specifically, one would have a data file that defines, for each known >>>>> URN, its type along with the set of permitted/expected/required >>>>> labels. On construction a MonitoringInfo data point, one would provide >>>>> a URN + value + labelMap, and optionally a type. On construction, the >>>>> constructor (method, factory, whatever) would look up the URN to >>>>> determine the type (or throw an error if it's both not known and not >>>>> provided), which could be then used to fetch an encoder of sorts (that >>>>> can go from value <-> proto encoding, possibly with some validation). >>>>> If labels and/or annotations are provided and the URN is known, we can >>>>> validate these as well. >>>>> >>>>> As for proto/enums vs. yaml, the former is nice because code >>>>> generation comes for free, but has turned out to be much more verbose >>>>> (both the definition and the use) and I'm still on the fence whether >>>>> it's a net win. >>>>> >>>>> (Unfortunately AutoValue won't help much here, as the ultimate goal is >>>>> to wrap a very nested proto OneOf, ideally with some validation. >>>>> (Also, in Python, generally, having optional, named arguments makes >>>>> this kind of builder pattern less needed.)) >>>>> >>>>> On Wed, Oct 24, 2018 at 4:12 AM Kenneth Knowles <k...@apache.org> >>>>> wrote: >>>>> > >>>>> > FWIW AutoValue will build most of that class for you, if it is as >>>>> you say. >>>>> > >>>>> > Kenn >>>>> > >>>>> > On Tue, Oct 23, 2018 at 6:04 PM Alex Amato <ajam...@google.com> >>>>> wrote: >>>>> >> >>>>> >> Hi Robert + beam dev list, >>>>> >> >>>>> >> I was thinking about your feedback in PR#6205, and agree that this >>>>> monitoring_infos.py became a bit big. >>>>> >> >>>>> >> I'm working on the Java Implementation of this now, and want to >>>>> incorporate some of these ideas and improve on this. >>>>> >> >>>>> >> I that that I should make something like a MonitoringInfoBuilder >>>>> class. With a few methods >>>>> >> >>>>> >> setUrn >>>>> >> setTimestamp >>>>> >> setting the value (One method for each Type we support. Setting >>>>> this will also set the type string) >>>>> >> >>>>> >> setInt64CounterValue >>>>> >> setDoubleCounterValue >>>>> >> setLatestInt64 >>>>> >> setTopNInt64 >>>>> >> setMonitoringDataTable >>>>> >> setDistributionInt64 >>>>> >> ... >>>>> >> >>>>> >> setting labels (will set the key and value properly for the label) >>>>> >> >>>>> >> setPTransform(value) >>>>> >> setPcollection(value) >>>>> >> ... >>>>> >> >>>>> >> >>>>> >> I think this will make building a metric much easier, you would >>>>> just call the 4 methods and the .build(). These builders are common in >>>>> Java. (I guess there is a similar thing way we could do in python? I'd >>>>> like >>>>> to go back and refactor that as well when I am done) >>>>> >> >>>>> >> ------------- >>>>> >> >>>>> >> As for your other suggestion to define metrics in the proto/enum >>>>> file instead of the yaml file. I am not too sure about the best strategy >>>>> for this. My initial thoughts are: >>>>> >> >>>>> >> Make a proto extension allowing you to describe/define a >>>>> MonitoringInfo's (the same info as the metric_definitions.yaml file): >>>>> >> >>>>> >> URN >>>>> >> Type >>>>> >> Labels required >>>>> >> Annotations: Description, Units, etc. >>>>> >> >>>>> >> Make the builder read in that MonitoringInfo definision/description >>>>> assert everything is set properly? I think this would be a decent data >>>>> driven approach. >>>>> >> >>>>> >> I was wondering if you had something else in mind? >>>>> >> >>>>> >> Thanks >>>>> >> Alex >>>>> >> >>>>> >> >>>>> >>>>