[ 
https://issues.apache.org/jira/browse/FLINK-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310618#comment-16310618
 ] 

Wei-Che Wei commented on FLINK-7935:
------------------------------------

Hi [~elevy]
What you described is almost correct. FLINK-7692 lets users expose their own 
variables through {{MetricGroup}}, but mapping the metric name and the 
metric's variables to a third-party metric system is the reporter's 
responsibility.
You can use {{MetricGroup#getAllVariables()}} to get {{type:messageType}} and 
the other system scope variables. The DataDog reporter can map these to tags.
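
To illustrate the tag mapping, here is a minimal sketch in plain Java. It does not use any Flink classes: {{VariableTags}} and its methods are hypothetical names, the map only stands in for what {{MetricGroup#getAllVariables()}} returns, and the sketch assumes the keys arrive wrapped in angle brackets such as {{<host>}}.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Flink: turns a variables map into "key:value" tags.
final class VariableTags {

    // Assumption: variable keys look like "<host>"; strip the brackets when present.
    static String stripBrackets(String key) {
        if (key.length() >= 2 && key.startsWith("<") && key.endsWith(">")) {
            return key.substring(1, key.length() - 1);
        }
        return key;
    }

    // Builds one "name:value" tag per variable, in the map's iteration order.
    static List<String> toTags(Map<String, String> variables) {
        List<String> tags = new ArrayList<>();
        for (Map.Entry<String, String> entry : variables.entrySet()) {
            tags.add(stripBrackets(entry.getKey()) + ":" + entry.getValue());
        }
        return tags;
    }

    public static void main(String[] args) {
        // Stand-in for the map a reporter would read from the metric group.
        Map<String, String> variables = new LinkedHashMap<>();
        variables.put("<host>", "worker-1");
        variables.put("<type>", "messageType");
        System.out.println(toTags(variables));
    }
}
```

With this kind of mapping, the user-defined variable becomes just one more tag next to the system scope variables.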
{{AbstractMetricGroup#getLogicalScope(CharacterFilter)}} returns 
{{<parent.logical.scope>.messages.type}}, so a reporter can use it to build the 
metric name, which would be {{<parent.logical.scope>.messages.type.counts}}. For 
example, the Prometheus reporter uses it to expose the metric name. 
[[1|https://github.com/apache/flink/blob/beb11976fe63c20a5dc9f22ea713c05b4d5e9585/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporter.java#L217]]
However, {{MetricGroup#getMetricIdentifier(String)}} will still return 
{{<parent.identifier>.messages.type.<messageType>}}. It seems that the DataDog 
reporter uses this function to get the metric name. 
[[2|https://github.com/apache/flink/blob/master/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L63]]
I think this is a limitation of the DataDog reporter. Maybe we can make 
{{AbstractMetricGroup#getLogicalScope(CharacterFilter)}} a public API and 
update the DataDog reporter to use it.
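
To make the difference between the two names concrete, here is a simplified model in plain Java. It is only a sketch and not Flink's implementation: {{ScopeSketch}}, {{Segment}} and the example values ({{parent}}, {{orders}}) are hypothetical, and the model assumes that the logical scope skips the value half of a user-defined key/value group while the identifier keeps it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of a metric group path; not Flink code.
final class ScopeSketch {

    // One path segment; isValue marks the value half of a key/value group.
    static final class Segment {
        final String name;
        final boolean isValue;

        Segment(String name, boolean isValue) {
            this.name = name;
            this.isValue = isValue;
        }
    }

    // Logical name: value segments are left out, so the name stays generic.
    static String logicalName(List<Segment> path, String metricName) {
        List<String> parts = new ArrayList<>();
        for (Segment segment : path) {
            if (!segment.isValue) {
                parts.add(segment.name);
            }
        }
        parts.add(metricName);
        return String.join(".", parts);
    }

    // Identifier: every segment is kept, so each value yields a distinct name.
    static String identifier(List<Segment> path, String metricName) {
        List<String> parts = new ArrayList<>();
        for (Segment segment : path) {
            parts.add(segment.name);
        }
        parts.add(metricName);
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        List<Segment> path = Arrays.asList(
                new Segment("parent", false),
                new Segment("messages", false),
                new Segment("type", false),
                new Segment("orders", true)); // stands in for <messageType>
        System.out.println(logicalName(path, "counts"));
        System.out.println(identifier(path, "counts"));
    }
}
```

In this model, a reporter that names metrics by the identifier gets one metric per message type, while one that uses the logical name gets a single metric and can carry the message type as a tag.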

cc [~Zentol]
Do you have any suggestions or comments? If I made any mistakes in my comment, 
please correct me. Thank you.

> Metrics with user supplied scope variables
> ------------------------------------------
>
>                 Key: FLINK-7935
>                 URL: https://issues.apache.org/jira/browse/FLINK-7935
>             Project: Flink
>          Issue Type: Improvement
>          Components: Metrics
>    Affects Versions: 1.3.2
>            Reporter: Elias Levy
>
> We use DataDog for metrics.  DD and Flink differ somewhat in how they track 
> metrics.
> Flink names and scopes metrics together, at least by default. E.g. by default 
>  the System scope for operator metrics is 
> {{<host>.taskmanager.<tm_id>.<job_name>.<operator_name>.<subtask_index>}}.  
> The scope variables become part of the metric's full name.
> In DD the metric would be named something generic, e.g. 
> {{taskmanager.job.operator}}, and they would be distinguished by their tag 
> values, e.g. {{tm_id=foo}}, {{job_name=var}}, {{operator_name=baz}}.
> Flink allows you to configure the format string for system scopes, so it is 
> possible to set the operator scope format to {{taskmanager.job.operator}}.  
> We do this for all scopes:
> {code}
> metrics.scope.jm: jobmanager
> metrics.scope.jm.job: jobmanager.job
> metrics.scope.tm: taskmanager
> metrics.scope.tm.job: taskmanager.job
> metrics.scope.task: taskmanager.job.task
> metrics.scope.operator: taskmanager.job.operator
> {code}
> This seems to work.  The DataDog Flink metrics plugin submits all scope 
> variables as tags, even if they are not used within the scope format.  And it 
> appears internally this does not lead to metrics conflicting with each other.
> We would like to extend this to user-defined metrics, but you cannot define 
> variables/scopes when adding a metric group or metric with the user API. We 
> would like to be able to, so that in DD we have a single metric with a tag 
> that takes many different values, rather than hundreds of metrics for the one 
> value we want to measure across different event types.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
