[jira] [Commented] (IGNITE-17398) Opencensus metrics do not work well with Prometheus since it do not have tags

2022-08-03 Thread Andrey N. Gura (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574579#comment-17574579
 ] 

Andrey N. Gura commented on IGNITE-17398:
-

[~sergeykad] We already know that the OpenCensus metrics exporter performs 
poorly, especially on deployments with a lot of caches. And of course that 
affects Ignite node performance and stability.
[~northdragon] As I remember, you wanted to add a filter to limit the number 
of exported metrics, right? The plain-text format is not well suited to such 
data sets, and labels/tags will make the data set even bigger.

Also, adding tags/labels can lead to breaking changes for users' existing 
metrics exporters. 

All these issues are important and cannot be ignored in favor of the 
convenience of using tags in Prometheus.

> Opencensus metrics do not work well with Prometheus since it do not have tags
> -
>
>                 Key: IGNITE-17398
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17398
>             Project: Ignite
>          Issue Type: Improvement
>    Affects Versions: 2.9.1
>            Reporter: Sergey Kadaner
>            Priority: Major
>              Labels: metrics
>         Attachments: image-2022-07-20-17-51-07-217.png
>
>
> The metrics created by the [new metrics 
> system|https://ignite.apache.org/docs/latest/monitoring-metrics/new-metrics] 
> are very inconvenient to use with Prometheus because they do not carry tags.
> For example, the Spring metric generated for the same cache looks like the 
> following in Prometheus and is very convenient to use:
>  
> {noformat}
> cache_gets_total{cache="MY_CACHE_NAME",name="MY_CACHE_NAME",result="hit",} 1387.0
> {noformat}
>  
> The native Ignite metric looks like the following: 
>  
> {noformat}
> cache_MY_CACHE_NAME_CacheHits 1387.0
> {noformat}
> The Spring-reported statistics can be easily filtered by cache name and other 
> attributes, while the built-in Ignite metrics do not provide an easy way to 
> access cache names. The only option seems to be parsing the 
> "cache_MY_CACHE_NAME_CacheHits" strings, which AFAIK is not supported by 
> Grafana.
>  
> For example, with tags it is very easy to build a Grafana graph of the cache 
> hit percentage that includes every existing cache, because Grafana 
> automatically extracts the cache name from the "cache" tag. Unfortunately, 
> this is not possible if there are no tags.
> !image-2022-07-20-17-51-07-217.png!
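
The name-parsing workaround mentioned in the description can be sketched in a few lines of Python. This is only an illustration, not part of Ignite or Grafana: the function name `to_labeled`, the suffix list `KNOWN_SUFFIXES`, and the target form `cache_<Metric>{cache="..."}` are all assumptions made for the example. It also shows why parsing is fragile: cache names may themselves contain underscores, so the metric suffixes have to be known in advance.

```python
# Hypothetical suffix list; a real deployment would need the full set of
# cache metric names exported by its Ignite version.
KNOWN_SUFFIXES = ("CacheHits", "CacheMisses")


def to_labeled(line: str) -> str:
    """Rewrite 'cache_<NAME>_<Metric> <value>' as
    'cache_<Metric>{cache="<NAME>"} <value>' (assumed target form)."""
    name, value = line.rsplit(" ", 1)
    prefix = "cache_"
    for suffix in KNOWN_SUFFIXES:
        tail = "_" + suffix
        if name.startswith(prefix) and name.endswith(tail):
            # Everything between the prefix and the known suffix is
            # treated as the cache name, underscores included.
            cache = name[len(prefix):-len(tail)]
            return f'cache_{suffix}{{cache="{cache}"}} {value}'
    return line  # unrecognised metric: pass through unchanged


print(to_labeled("cache_MY_CACHE_NAME_CacheHits 1387.0"))
```

A metric whose suffix is not in the list is returned unchanged, so the sketch degrades safely but silently, which is part of what makes this approach inconvenient.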



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-17398) Opencensus metrics do not work well with Prometheus since it do not have tags

2022-08-02 Thread Sergey Kadaner (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574198#comment-17574198
 ] 

Sergey Kadaner commented on IGNITE-17398:
-

I see your point about it not being free; however, the Prometheus docs say 
"Each labelset is an additional time series". As I understand it, that means 
labels cost the same as creating the same number of time series. I.e. the 
_cache_MY_CACHE_NAME1_ and _cache_MY_CACHE_NAME2_ metrics have the same cost 
as _cache{name="MY_CACHE_NAME1"}_ and _cache{name="MY_CACHE_NAME2"}_. So 
basically there is no additional cost in this case.
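
The series-count argument above can be checked with a small Python sketch. It assumes, for illustration only, that a series is identified by everything before the sample value (metric name plus label set); the helper `series_key` is hypothetical. It counts series and says nothing about exporter-side CPU or GC cost.

```python
def series_key(line: str) -> str:
    """Return the series identity: the text before the sample value."""
    return line.rsplit(" ", 1)[0]


# Cache name embedded in the metric name.
flat = [
    "cache_MY_CACHE_NAME1 1.0",
    "cache_MY_CACHE_NAME2 2.0",
]
# Cache name carried in a label.
labeled = [
    'cache{name="MY_CACHE_NAME1"} 1.0',
    'cache{name="MY_CACHE_NAME2"} 2.0',
]

flat_series = {series_key(line) for line in flat}
labeled_series = {series_key(line) for line in labeled}

# Both schemes yield one time series per cache.
print(len(flat_series), len(labeled_series))
```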

It is hard to say how it affects the GC pressure without actually measuring it, 
but AFAIK modern garbage collectors (G1 and newer) are very good with 
short-lived objects and should not have issues reclaiming them.

The issue is also not simply a convenience problem unless you can suggest how 
to draw diagrams without knowing cache names in advance.






[jira] [Commented] (IGNITE-17398) Opencensus metrics do not work well with Prometheus since it do not have tags

2022-08-02 Thread Andrey N. Gura (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574186#comment-17574186
 ] 

Andrey N. Gura commented on IGNITE-17398:
-

The convenience of tags/labels is not free. The Prometheus developers warn 
about this: [Do not overuse 
labels|https://prometheus.io/docs/practices/instrumentation/#do-not-overuse-labels].

Moreover, the OpenCensus exporter produces a lot of garbage due to the 
text-based nature of the metrics format. Adding tags will significantly 
increase GC pressure and the size of the exported data. 
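
The size part of this concern can be given a rough scale with a Python sketch. The labeled form `cache_CacheHits{cache="..."}`, the helper `payload_bytes`, and the cache and metric counts are assumptions chosen for illustration; the sketch only compares text lengths and does not measure GC pressure in the exporter.

```python
# One sample line in each style, including the trailing newline.
flat_line = "cache_MY_CACHE_NAME_CacheHits 1387.0\n"
labeled_line = 'cache_CacheHits{cache="MY_CACHE_NAME"} 1387.0\n'  # assumed form


def payload_bytes(line: str, caches: int, metrics_per_cache: int) -> int:
    """Approximate scrape size if every sample line had this length."""
    return len(line.encode("utf-8")) * caches * metrics_per_cache


# Hypothetical deployment: 500 caches with 100 metrics each.
flat_total = payload_bytes(flat_line, 500, 100)
labeled_total = payload_bytes(labeled_line, 500, 100)
print(flat_total, labeled_total, labeled_total - flat_total)
```

The per-line overhead here is just the label syntax (braces, label name, quotes); each additional label adds its own name and value to every line.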

So, from my point of view, we should not add tags/labels. It comes down to 
convenience versus Ignite stability and performance. 



