[ https://issues.apache.org/jira/browse/FLINK-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Dail updated FLINK-6911:
------------------------------
Description:
The StatsDReporter does not escape spaces in the metric name. It is generally
accepted that spaces in the metric name are a bad idea:
https://stackoverflow.com/questions/29674488/whitespace-in-statsd-metric-name
It should also be noted that the Flink StatsDReporter was based on the ReadyTalk
StatsD implementation (this is indicated in a comment in the code). Note that
the ReadyTalk implementation does replace whitespace:
https://github.com/ReadyTalk/metrics-statsd/blob/master/metrics-statsd-common/src/main/java/com/readytalk/metrics/StatsD.java#L129
Specifically, I am integrating with Telegraf. It actually splits the name on
spaces and treats these as (name, value, timestamp). It ignores everything
except the name.
https://github.com/influxdata/telegraf/blob/master/plugins/parsers/graphite/parser.go#L225
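The splitting behaviour described above can be illustrated with a minimal Java sketch. This is not Telegraf's code (Telegraf is written in Go); the class and method names are hypothetical, and the sketch only assumes that the parser keeps the first whitespace-separated field as the name.

```java
// Hypothetical illustration only: mimics a parser that splits a line on
// whitespace and keeps the first field as the metric name.
final class WhitespaceSplitDemo {

    // Returns the first whitespace-separated field of the given line.
    static String firstField(String line) {
        String[] fields = line.trim().split("\\s+");
        return fields[0];
    }
}
```

Under that assumption, two different metrics whose names begin with the same job name containing a space would both be reduced to the text before the first space, which matches the "same bucket" symptom described in this issue.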
Initially I found this issue when I had a space in the job name. Flink encodes
the job name into the metric names as is, so when I sent these to Telegraf, all
of the job-level metrics ended up in the same bucket.
Flink also uses things like "Sink- <name>" and "Source- <name>" to encode
source/sink. These also do not work with telegraf. I end up with metrics that
look like this inside telegraf:
{noformat}
taskmanager_5e453417d87c755da6311b1940cc602f_TurbineHeatProcessor_examples_turbineHeatTest_Sink-
{noformat}
The actual name is truncated after the space.
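One possible fix is to replace whitespace in the metric name before it is sent, similar in spirit to what the ReadyTalk implementation is described as doing. The following is a minimal sketch, not the actual Flink or ReadyTalk code: the class name, the method name, and the choice of a dash as the replacement character are all assumptions.

```java
import java.util.regex.Pattern;

// Hypothetical helper: replaces each run of whitespace in a metric name
// with a single dash so the name survives whitespace-splitting parsers.
final class StatsDNameSanitizer {

    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // The dash is an assumed replacement character, chosen for illustration.
    static String sanitize(String name) {
        return WHITESPACE.matcher(name).replaceAll("-");
    }
}
```

With this sketch, a name component such as "Sink- turbineHeatTest" would be sent as "Sink--turbineHeatTest" and would no longer be cut off at the space.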
was:
The StatsDReporter does not escape spaces in the metric name. It is generally
accepted that spaces in the metric name are a bad idea:
https://stackoverflow.com/questions/29674488/whitespace-in-statsd-metric-name
Specifically, I am integrating with Telegraf. It actually splits the name on
spaces and treats these as (name, value, timestamp). It ignores everything
except the name.
https://github.com/influxdata/telegraf/blob/master/plugins/parsers/graphite/parser.go#L225
Initially I found this issue when I had a space in the job name. Flink encodes
the job name into the metrics as is. So when I put these into telegraf, all of
the job level metrics ended up with the same bucket in telegraf.
Flink also uses things like "Sink- <name>" and "Source- <name>" to encode
source/sink. These also do not work with telegraf. I end up with metrics that
look like this inside telegraf:
{noformat}
taskmanager_5e453417d87c755da6311b1940cc602f_TurbineHeatProcessor_examples_turbineHeatTest_Sink-
{noformat}
The actual name is truncated after the space.
> StatsD Metrics name should escape spaces
> -----------------------------------------
>
> Key: FLINK-6911
> URL: https://issues.apache.org/jira/browse/FLINK-6911
> Project: Flink
> Issue Type: Improvement
> Components: Metrics
> Affects Versions: 1.3.0
> Environment: StatsD Metrics with Telegraf server
> Reporter: Chris Dail