[ 
https://issues.apache.org/jira/browse/FLINK-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151346#comment-15151346
 ] 

Dongwon Kim commented on FLINK-1502:
------------------------------------

To [~StephanEwen], [~mxm], [~jgrier], 
First of all, sorry for the late response.

We just need to make each TaskManager report its metrics to 
JMX/Ganglia/Graphite as you guys suggested.

To [~mxm], 
the main problem with such a design is that each newly launched 
TaskManager is given a randomly generated UUID, which creates too many 
Ganglia metrics, as [~jgrier] mentioned above.
I think [~jgrier]'s solution is quite simple yet viable:

cluster.<CLUSTER_NAME>.taskmanager.1.gc_time
cluster.<CLUSTER_NAME>.taskmanager.2.gc_time

To that end, we need to open a new issue to assign such IDs to TaskManagers 
running on the same host.
One concern is that, even when only one TaskManager runs on each node, we still 
need to do such numbering (e.g. <CLUSTER_NAME>.taskmanager.1.gc_time).
I'm okay with it, but users might find the numbering quite ugly.
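
To make the naming scheme concrete, here is a minimal sketch of how per-host 
numbering could work. The class and method names (TaskManagerNaming, 
assignIndex, metricName) are hypothetical and not part of Flink; the sketch 
only assumes that whoever registers TaskManagers knows the host each one runs 
on.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: hands out small per-host indices instead of random UUIDs. */
final class TaskManagerNaming {
    // Next free index for each host; indices start at 1.
    private final Map<String, Integer> nextIndexPerHost = new HashMap<>();

    /** Returns 1 for the first TaskManager registered on a host, 2 for the second, and so on. */
    synchronized int assignIndex(String host) {
        int index = nextIndexPerHost.getOrDefault(host, 1);
        nextIndexPerHost.put(host, index + 1);
        return index;
    }

    /** Builds a name of the form cluster.<CLUSTER_NAME>.taskmanager.<index>.<metric>. */
    static String metricName(String clusterName, int index, String metric) {
        return "cluster." + clusterName + ".taskmanager." + index + "." + metric;
    }
}
```

This sketch never releases an index, so a real version would also have to 
free the index when a TaskManager goes away. Otherwise restarts would keep 
producing new metric names, which is the problem we are trying to avoid.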

What do you guys think?

> Expose metrics to graphite, ganglia and JMX.
> --------------------------------------------
>
>                 Key: FLINK-1502
>                 URL: https://issues.apache.org/jira/browse/FLINK-1502
>             Project: Flink
>          Issue Type: Sub-task
>          Components: JobManager, TaskManager
>    Affects Versions: 0.9
>            Reporter: Robert Metzger
>            Assignee: Dongwon Kim
>            Priority: Minor
>             Fix For: pre-apache
>
>
> The metrics library allows to expose collected metrics easily to other 
> systems such as graphite, ganglia or Java's JVM (VisualVM).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
