The PCollection value comes from the key on the pipeline proto[1]. That key is populated at pipeline construction time[2] and is built from the unique name of the PTransform plus the name of the output being used (that is, the tag, with .output as the default).
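
To make the naming rule concrete, here is a minimal sketch in plain Python. It is not Beam code: `pcollection_name` is a hypothetical helper that only mirrors the "PTransform unique name + output tag" rule described above, with "output" assumed as the default tag.

```python
# Hypothetical helper for illustration only; this is not a Beam API.
def pcollection_name(transform_name: str, output_tag: str = "output") -> str:
    """Join a PTransform's unique name and an output tag with a dot."""
    return f"{transform_name}.{output_tag}"


if __name__ == "__main__":
    # Uses the transform name reported in this thread.
    print(pcollection_name("Kati-Step-2/ParMultiDo(Anonymous)"))
```

With the transform name from this thread, the result matches the PCOLLECTION label in the attached monitoring_infos.
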
It looks like the counter's PTRANSFORM value is coming from the metric step name[3]. I would take a look at the pipeline proto[4] that is generated during pipeline construction and the process bundle descriptors[5] used during pipeline execution to see where, if anywhere, the name is being changed. Both modes should be able to generate names in the same style, but tracking down where they diverge is a good first step.

1: https://github.com/apache/beam/blob/957301519bb76a9647d026885fced1a775a7c9ff/model/pipeline/src/main/proto/org/apache/beam/model/pipeline/v1/beam_runner_api.proto#L68
2: https://github.com/apache/beam/blob/957301519bb76a9647d026885fced1a775a7c9ff/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PCollectionTranslation.java#L33
3: https://github.com/apache/beam/blob/434427e90b55027c5944fa73de68bff4f9a4e8fe/runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MetricsContainerImpl.java#L247
4: https://github.com/apache/beam/blob/434427e90b55027c5944fa73de68bff4f9a4e8fe/model/pipeline/src/main/proto/org/apache/beam/model/pipeline/v1/beam_runner_api.proto#L91
5: https://github.com/apache/beam/blob/434427e90b55027c5944fa73de68bff4f9a4e8fe/model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto#L189

On Wed, Jan 11, 2023 at 3:29 PM Katie Liu <[email protected]> wrote:
> Attaching the monitoring_infos received, if helpful.
>
> I observed that the PCOLLECTION name format is the same in non-portable mode,
> but the PTRANSFORM name has dashes instead.
>
> ```
> monitoring_infos {
>   urn: "beam:metric:element_count:v1"
>   type: "beam:metrics:sum_int64:v1"
>   payload: "\000"
>   labels {
>     key: "PCOLLECTION"
>     value: "Kati-Step-2/ParMultiDo(Anonymous).output"
>   }
> }
> monitoring_infos {
>   urn: "beam:metric:user:sum_int64:v1"
>   type: "beam:metrics:sum_int64:v1"
>   payload: "\n"
>   labels {
>     key: "NAME"
>     value: "count101"
>   }
>   labels {
>     key: "NAMESPACE"
>     value: "org.apache.beam.runners.samza.portable.SamzaPortableTest"
>   }
>   labels {
>     key: "PTRANSFORM"
>     value: "Kati-Step-2-ParMultiDo-Anonymous-"
>   }
> }
> ```
>
>
> On Wed, Jan 11, 2023 at 2:38 PM Katie Liu <[email protected]> wrote:
>
>> Hi beam-dev,
>>
>> I have a question regarding the PTransform name formatting.
>> For the same user defined function, the naming is different using samza
>> portable is "Kati-Step-2-ParMultiDo-Anonymous-", while in normal mode it
>> is "Kati-Step-2/ParMultiDo(Anonymous)".
>>
>> Does this problem only exist in Samza? And are there pointers to where
>> the PTransform name is generated?
>>
>> Thanks,
>> Katie
>>
>
