Francisco,

Using "-XX:+UseStringDeduplication" to enable String deduplication for the whole JVM process could have unwanted side effects, so calling "intern()" on the cache keys directly was deemed a much safer option.
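As a rough illustration of why interning the key removes the duplication, here is a minimal sketch. The class and method names are hypothetical and this is not the exporter's code; the figure of 40 only stands in for the number of rules mentioned in the original report, and the MBean name is a made-up example. It builds the same key once per rule and counts how many distinct String objects result, by identity rather than by equals().

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class InternDemo {

    // Hypothetical helper that mirrors the concatenation quoted in this thread.
    // A runtime concatenation creates a new String object on every call.
    public static String buildKey(String beanName, String attributeName,
                                  String matchBeanValue, boolean intern) {
        String key = beanName + attributeName + ": " + matchBeanValue;
        // intern() returns the canonical instance from the JVM string pool.
        return intern ? key.intern() : key;
    }

    // Builds the same key once per rule and counts distinct objects by identity.
    public static int distinctInstances(int ruleCount, boolean intern) {
        Set<String> identities =
                Collections.newSetFromMap(new IdentityHashMap<>());
        for (int rule = 0; rule < ruleCount; rule++) {
            identities.add(buildKey("java.lang:type=Memory", "HeapMemoryUsage",
                    "used", intern));
        }
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println("without intern: " + distinctInstances(40, false));
        System.out.println("with intern:    " + distinctInstances(40, true));
    }
}
```

Without intern() each rule holds its own copy of an equal String, so the count equals the number of rules; with intern() all rules share one instance. The trade-off is that interned strings live in the JVM-wide string pool, which is why the change was run through the full regression test.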

On Wednesday, December 18, 2024 at 7:30:48 AM UTC-5 Francisco Melo junior wrote:

> Hello,
>
> I was wondering what the result is if you disable string deduplication
> (-XX:+UseStringDeduplication), or was that the caching you did? It should
> reduce the footprint considerably if the duplication is the problem, not
> increase it.
>
> -Francisco
>
> On Dec 13, 2024, at 9:54 AM, Яна Илиева <[email protected]> wrote:
>
> Hello Doug,
>
> Thank you for the quick reaction and the testing. I will proceed with
> opening the PR.
>
> Kind regards,
> Yana
>
> On Friday, December 13, 2024 at 3:44:59 PM UTC+2 Doug Hoard wrote:
>
>> Yana,
>>
>> The full integration test passed without issues. Can you create a PR of
>> your findings and the change so we can document it and give credit?
>>
>> -Doug
>>
>> On Friday, December 13, 2024 at 1:44:47 AM UTC-5 Doug Hoard wrote:
>>
>>> Using String intern() can help in some scenarios and cause performance
>>> issues in other scenarios.
>>>
>>> I made your change on my development branch and ran a quick test
>>> (./quick-test.sh) and didn't see any unit test or integration test issues.
>>>
>>> I just started a full regression test and will have some results in the
>>> morning (US time). (It takes a little over 3 hours.)
>>>
>>> -Doug
>>>
>>> On Thursday, December 12, 2024 at 11:02:02 AM UTC-5 Яна Илиева wrote:
>>>
>>>> Hello,
>>>>
>>>> While using the Prometheus JMX Exporter as a Java agent in our
>>>> application, we observed frequent GC cleanups. It turns out that the
>>>> available heap space was being exhausted by the caching of the
>>>> Prometheus JMX Exporter.
>>>>
>>>> For context: we have an exporter YAML file with about 40 rules defined,
>>>> and there are many JMX MBeans which we process in order to get the
>>>> required metrics. Without caching enabled for the rules, a request
>>>> takes at least 1 minute to complete. In an attempt to reduce this time,
>>>> we enabled caching and observed that the MatchedRulesCache object can
>>>> take around 600 MB in a particular case. There is a realistic potential
>>>> in our case that this object grows above 1 GB, which is a huge amount
>>>> of space for this kind of process.
>>>>
>>>> We identified that the major reason for the large size of
>>>> MatchedRulesCache is that it contains duplicate String objects for the
>>>> same MBean names. The object keeps all MBean names the rules were
>>>> matched against previously, in order to avoid expensive pattern
>>>> matching later. Although each rule caches the same set of MBean names,
>>>> those are in fact separate objects in the heap.
>>>>
>>>> The origin of the matter was found in
>>>> https://github.com/prometheus/jmx_exporter/blob/a3dac9acee1464531cd87502579178a1fec1cc76/collector/src/main/java/io/prometheus/jmx/JmxCollector.java#L584
>>>>
>>>> where, for each rule with caching enabled, a new String object is
>>>> created, although an equal one could already exist in memory from a
>>>> previous iteration.
>>>>
>>>> The duplication is by a factor of the number of rules with caching
>>>> enabled.
>>>>
>>>> I experimented with interning the String,
>>>>
>>>>     String matchName = (beanName + attributeName + ": " + matchBeanValue).intern();
>>>>
>>>> so that the JVM's string pool is utilized and the String objects are
>>>> reused. The result in speed and heap space used by the cache is
>>>> described below:
>>>>
>>>> [image: Screenshot from 2024-12-12 14-53-26.png]
>>>>
>>>> What do you think about this? Could this suggestion have negative
>>>> consequences in certain cases?
>>>>
>>>> Kind regards,
>>>> Yana Ilieva

--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/prometheus-developers/b8828a92-6422-4fb6-9103-e038e1c4b191n%40googlegroups.com.

