[jira] [Commented] (IGNITE-17398) Opencensus metrics do not work well with Prometheus since it do not have tags
    [ https://issues.apache.org/jira/browse/IGNITE-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574579#comment-17574579 ]

Andrey N. Gura commented on IGNITE-17398:
-----------------------------------------

[~sergeykad] We already know that the OpenCensus metrics exporter performs poorly, especially on deployments with a lot of caches. And of course it affects Ignite node performance and stability.

[~northdragon] As I remember, you wanted to add a filter in order to limit the number of exported metrics, right?

The plain text format is not friendly to such data sets, and labels/tags will make the data set bigger. Adding tags/labels can also lead to breaking changes for existing users' metrics exporters.

All these issues are important and cannot be ignored in favor of the convenience of using tags in Prometheus.

> Opencensus metrics do not work well with Prometheus since it do not have tags
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-17398
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17398
>             Project: Ignite
>          Issue Type: Improvement
>    Affects Versions: 2.9.1
>            Reporter: Sergey Kadaner
>            Priority: Major
>              Labels: metrics
>         Attachments: image-2022-07-20-17-51-07-217.png
>
>
> The metrics created by the [new metrics
> system|https://ignite.apache.org/docs/latest/monitoring-metrics/new-metrics]
> are very inconvenient to use with Prometheus since it does not use tags.
> For example, a Spring metric generated for the same cache looks like the
> following in Prometheus and is very convenient to use:
>
> {noformat}
> cache_gets_total{cache="MY_CACHE_NAME",name="MY_CACHE_NAME",result="hit",} 1387.0
> {noformat}
>
> The native Ignite metric looks like the following:
>
> {noformat}
> cache_MY_CACHE_NAME_CacheHits 1387.0
> {noformat}
>
> The Spring-reported statistics can be easily filtered by cache name and other
> attributes, while the built-in Ignite metrics do not provide an easy way to
> access cache names. The only option seems to be parsing the
> "cache_MY_CACHE_NAME_CacheHits" strings, which, AFAIK, is not supported by
> Grafana.
>
> For example, with tags it is very easy to get a graph in Grafana with a cache
> hit percentage that includes every existing cache. It automatically extracts
> the cache name from the "cache" tag. Unfortunately, this is not possible if
> there are no tags.
> !image-2022-07-20-17-51-07-217.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
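To make the parsing workaround mentioned in the description concrete, here is a minimal Python sketch that splits a flat metric name such as cache_MY_CACHE_NAME_CacheHits into a metric name plus a cache label. It is not part of Ignite or the OpenCensus exporter. The helper name to_labelled and the naming pattern (a "cache_" prefix, then the cache name, then a metric name without underscores) are assumptions inferred only from the single example in the issue.

```python
import re

# Assumed pattern, based only on the one example in the issue:
# "cache_<CACHE_NAME>_<MetricName>", where <MetricName> contains no underscores.
FLAT_NAME = re.compile(r"^cache_(?P<cache>.+)_(?P<metric>[A-Za-z0-9]+)$")


def to_labelled(sample_line: str) -> str:
    """Rewrite one flat exposition line into a labelled one (hypothetical helper)."""
    name, value = sample_line.rsplit(" ", 1)
    match = FLAT_NAME.match(name)
    if match is None:
        # Leave anything that does not fit the assumed pattern untouched.
        return sample_line
    return 'cache_{metric}{{cache="{cache}"}} {value}'.format(
        metric=match["metric"], cache=match["cache"], value=value
    )


print(to_labelled("cache_MY_CACHE_NAME_CacheHits 1387.0"))
```

The split is only unambiguous while metric names contain no underscores; if they do, the cache name cannot be recovered reliably from the flat name, so this kind of post-processing is fragile.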
[jira] [Commented] (IGNITE-17398) Opencensus metrics do not work well with Prometheus since it do not have tags
    [ https://issues.apache.org/jira/browse/IGNITE-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574198#comment-17574198 ]

Sergey Kadaner commented on IGNITE-17398:
-----------------------------------------

I see your point about it not being free. However, the Prometheus documentation says "Each labelset is an additional time series". As I understand it, that means a labelset costs the same as creating the same number of time series. I.e. the cache_MY_CACHE_NAME1 and cache_MY_CACHE_NAME2 metrics have the same cost as cache{name="MY_CACHE_NAME1"} and cache{name="MY_CACHE_NAME2"}. So basically there is no additional cost in this case.

It is hard to say how it affects GC pressure without actually measuring it, but AFAIK modern garbage collectors (G1 and newer) are very good with short-lived objects and should not have issues reclaiming them.

The issue is also not simply a convenience problem, unless you can suggest how to draw diagrams without knowing cache names in advance.
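The series-count argument in the comment above can be checked with simple arithmetic. The following Python sketch is illustrative only: the cache names, metric names, and the labelled naming scheme are made up for the comparison, and it measures only the number of distinct series and the size of the text payload. It says nothing about GC pressure, which would have to be measured on a real node.

```python
# Hypothetical cache and metric names, used only for this comparison.
caches = [f"MY_CACHE_NAME{i}" for i in range(1, 4)]
metrics = ["CacheHits", "CacheMisses"]

# Flat scheme: the cache name is embedded in the metric name.
flat = [f"cache_{c}_{m} 0.0" for c in caches for m in metrics]
# Labelled scheme: the cache name is carried in a "cache" label.
labelled = [f'cache_{m}{{cache="{c}"}} 0.0' for c in caches for m in metrics]

# A series is identified by everything before the sample value.
flat_series = {line.split()[0] for line in flat}
labelled_series = {line.split()[0] for line in labelled}

# Payload size in characters, counting one newline per sample line.
flat_bytes = sum(len(line) + 1 for line in flat)
labelled_bytes = sum(len(line) + 1 for line in labelled)

print(len(flat_series), len(labelled_series))
print(flat_bytes, labelled_bytes)
```

In this toy case both schemes yield the same number of series, while each labelled sample line is a few characters longer than its flat counterpart; additional labels per sample would widen that gap.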
[jira] [Commented] (IGNITE-17398) Opencensus metrics do not work well with Prometheus since it do not have tags
    [ https://issues.apache.org/jira/browse/IGNITE-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574186#comment-17574186 ]

Andrey N. Gura commented on IGNITE-17398:
-----------------------------------------

The convenience of tags/labels is not free. Prometheus developers warn about it: [Do not overuse labels|https://prometheus.io/docs/practices/instrumentation/#do-not-overuse-labels].

Moreover, the OpenCensus exporter produces a lot of garbage due to the text-based nature of the metrics format. Adding tags will significantly increase GC pressure and the size of the exported data.

So, from my point of view, we should not add tags/labels. It comes down to convenience versus Ignite stability and performance.