[ 
https://issues.apache.org/jira/browse/FLINK-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278826#comment-17278826
 ] 

jiguodai commented on FLINK-11742:
----------------------------------

In fact, it has nothing to do with "instance". The real reason why metrics in 
Pushgateway sometimes disappear is that 

> Push metrics to Pushgateway without "instance"
> ----------------------------------------------
>
>                 Key: FLINK-11742
>                 URL: https://issues.apache.org/jira/browse/FLINK-11742
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Metrics
>            Reporter: Tom Goong
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2019-02-25-17-16-28-618.png, 
> image-2019-02-25-17-16-59-034.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> According to the official documentation,
> [https://prometheus.io/docs/concepts/jobs_instances/]
> [https://github.com/prometheus/pushgateway]
> when sending a metric to Prometheus Pushgateway, you need to provide an 
> "instance" label.
>  In actual use, without the "instance" label, Prometheus stores the metrics 
> incorrectly: the series are not continuous and a lot of data is lost. After 
> adding "instance", it returns to normal.
>  
> no "instance" 
> !image-2019-02-25-17-16-28-618.png!
>  
> with "instance"
> !image-2019-02-25-17-16-59-034.png!
>  
>  
> {quote}In Prometheus terms, an endpoint you can scrape is called an instance, 
> usually corresponding to a single process. A collection of instances with the 
> same purpose, a process replicated for scalability or reliability for 
> example, is called a job.
> {quote}
> {quote}For example, an API server job with four replicated instances:
> job: api-server
> -- instance 1: 1.2.3.4:5670
> -- instance 2: 1.2.3.4:5671
> -- instance 3: 5.6.7.8:5670
> -- instance 4: 5.6.7.8:5671
> {quote}
> [https://prometheus.io/docs/concepts/jobs_instances/#jobs-and-instances]
> I think a Flink job corresponds to a Prometheus job, and each TaskManager and 
> JobManager corresponds to a different instance. If the jobName is used as the 
> instance label, the same metrics from different TaskManagers will conflict, and 
> operations such as sum will fail.
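
For illustration only, the conflict described in the issue can be sketched with a toy in-memory model of Pushgateway grouping. This is not Flink's reporter code. It assumes the documented push path /metrics/job/<job>{/<label>/<value>} and that a PUT replaces every metric in the addressed group. The names my-flink-job, tm-1, tm-2 and the metric values are hypothetical.

```python
def push_path(job, grouping=None):
    """Build a Pushgateway push path: /metrics/job/<job>{/<label>/<value>}.

    Assumes simple names and values (no '/' and no characters needing escaping).
    """
    parts = ["metrics", "job", job]
    for label, value in (grouping or {}).items():
        parts += [label, value]
    return "/" + "/".join(parts)


class FakePushgateway:
    """Toy stand-in for Pushgateway: a PUT replaces the whole addressed group."""

    def __init__(self):
        self.groups = {}

    def put(self, path, metrics):
        # Everything previously stored under this grouping key is discarded.
        self.groups[path] = dict(metrics)


def simulate(use_instance):
    """Two hypothetical TaskManagers push the same metric name in turn."""
    gateway = FakePushgateway()
    for host, value in (("tm-1", 10), ("tm-2", 20)):
        grouping = {"instance": host} if use_instance else None
        gateway.put(push_path("my-flink-job", grouping), {"numRecordsIn": value})
    return gateway.groups


if __name__ == "__main__":
    print(simulate(use_instance=False))
    print(simulate(use_instance=True))
```

In this model, pushing without "instance" leaves a single group that holds only the last push, while pushing with "instance" keeps one group per TaskManager, so a sum across them remains possible.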



--
This message was sent by Atlassian Jira
(v8.3.4#803005)