Also, UseStringDeduplication can’t be enabled from code.

@Yana I have a few PRs that I need to find time to review before the next release.
I’m targeting a couple of weeks.

On Thursday, January 23, 2025 at 7:39:04 AM UTC-5 Яна Илиева wrote:

> Hi,
>
> Our intention was to minimize the scope of the desired effect to only
> those strings which are used in the caching mechanism of the JMX exporter.
> UseStringDeduplication is a global setting that affects the main
> application as well, and this is undesirable in our case.
>
> If there are other solutions to the problem at hand, I'll be glad to
> explore them, of course.
>
> P.S.
> Doug, when do you think this change will be available in a release, be it
> minor or major?
>
> Kind regards,
> Yana
>
> On Friday, December 20, 2024 at 1:57:27 AM UTC+2 Francisco Melo junior
> wrote:
>
>> Sure, thank you.
>>
>> Can you be more specific on the consequences?
>>
>> Evidently there will be an overhead for the comparison of each string,
>> if that’s what you mean. But the focus here is a smaller footprint, and
>> the overhead should be (almost) similar to intern(), because the
>> operations are done the same number of times.
>>
>> Also, it should only remove the duplicates but keep all the strings in
>> the pool.
>>
>> Unless the strings are used as the actual keys in the cache, in which
>> case removing the duplicates will indeed be problematic because you will
>> lose access to the values associated with them.
>>
>> Sent from Earth
>>
>> On Dec 19, 2024, at 3:19 PM, Doug Hoard <[email protected]> wrote:
>>
>> Francisco,
>>
>> Using "-XX:+UseStringDeduplication" to enable String deduplication for
>> the whole JVM process could have unwanted side effects, so performing the
>> "intern()" of the cache keys directly was deemed a much safer option.
>>
>> On Wednesday, December 18, 2024 at 7:30:48 AM UTC-5 Francisco Melo junior
>> wrote:
>>
>>> Hello, I was wondering what the result is if you use string
>>> deduplication (-XX:+UseStringDeduplication), or was that the caching
>>> you did? It should reduce the footprint considerably if the duplication
>>> is the problem, not increase it.
>>>
>>> -Francisco
>>>
>>> On Dec 13, 2024, at 9:54 AM, Яна Илиева <[email protected]> wrote:
>>>
>>> Hello Doug,
>>>
>>> Thank you for the quick reaction and the testing. I will proceed with
>>> opening the PR.
>>>
>>> Kind regards,
>>> Yana
>>>
>>> On Friday, December 13, 2024 at 3:44:59 PM UTC+2 Doug Hoard wrote:
>>>
>>>> Yana,
>>>>
>>>> The full integration test passed without issues. Can you create a PR of
>>>> your findings and the change so we can document it and give credit?
>>>>
>>>> -Doug
>>>>
>>>> On Friday, December 13, 2024 at 1:44:47 AM UTC-5 Doug Hoard wrote:
>>>>
>>>>> Using String intern() can help in some scenarios and cause performance
>>>>> issues in other scenarios.
>>>>>
>>>>> I made your change on my development branch and ran a quick test
>>>>> (./quick-test.sh) and didn't see any unit test or integration test issues.
>>>>>
>>>>> I just started a full regression test and will have some results in
>>>>> the morning (US time.) (It takes a little over 3 hours.)
>>>>>
>>>>> -Doug
>>>>>
>>>>> On Thursday, December 12, 2024 at 11:02:02 AM UTC-5 Яна Илиева wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> While using the Prometheus JMX Exporter as a Java agent in our
>>>>>> application, we observed frequent GC clean ups and it turns out it is
>>>>>> because the available heap space got exhausted by the caching of the
>>>>>> Prometheus JMX Exporter.
>>>>>>
>>>>>> For context - we have an exporter yaml file with about 40 rules
>>>>>> defined and there are many JMX MBeans which we process in order to get
>>>>>> the required metrics. Without caching enabled for the rules, the
>>>>>> request can take a minimum of 1 minute to complete. In an attempt to
>>>>>> reduce this time, we enabled caching and observed that the
>>>>>> MatchedRulesCache object can take around 600 MB in a particular case.
>>>>>> There is a realistic potential in our case that this object grows
>>>>>> above 1 GB, which is a huge amount of space for this kind of process.
>>>>>>
>>>>>> We identified that the major reason for the large size of
>>>>>> MatchedRulesCache is that it contains duplicate String objects for the
>>>>>> same MBean names. The object keeps all MBean names the rules were
>>>>>> matched against previously, in order to avoid expensive pattern
>>>>>> matching later. Although each rule caches the same set of MBean names,
>>>>>> those are in fact separate objects in the heap.
>>>>>>
>>>>>> The origin of the matter was found in
>>>>>> https://github.com/prometheus/jmx_exporter/blob/a3dac9acee1464531cd87502579178a1fec1cc76/collector/src/main/java/io/prometheus/jmx/JmxCollector.java#L584
>>>>>> where for each rule with caching enabled, a new String object is
>>>>>> created, although an equal one could already exist in memory from a
>>>>>> previous iteration.
>>>>>>
>>>>>> The duplication is by a factor of the number of rules with caching
>>>>>> enabled.
>>>>>>
>>>>>> I experimented with interning the String,
>>>>>>
>>>>>> *String matchName = (beanName + attributeName + ": " + matchBeanValue).intern();*
>>>>>>
>>>>>> so that the JVM's string pool is utilized and the String objects are
>>>>>> reused. The result in speed and heap space used by the cache is shown
>>>>>> below:
>>>>>>
>>>>>> [image: Screenshot from 2024-12-12 14-53-26.png]
>>>>>>
>>>>>> What do you think about this, and do you think this suggestion could
>>>>>> have negative consequences in certain cases?
>>>>>>
>>>>>> Kind regards,
>>>>>> Yana Ilieva
>>>>>>
>>>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Developers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion visit
>>> https://groups.google.com/d/msgid/prometheus-developers/6c2d05ad-6674-424e-afe3-3289b370b286n%40googlegroups.com
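
For anyone reading this thread later, here is a minimal sketch of the effect discussed above: a key concatenated at runtime is a new String object every time, while intern() hands back one shared instance for equal contents. This is not the exporter's code. The class InternDemo, the method names, and the sample MBean values are hypothetical and only mirror the shape of the matchName expression quoted in the thread.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class InternDemo {

    // Hypothetical stand-in for the key built in JmxCollector; the real method differs.
    static String buildMatchName(String beanName, String attributeName,
                                 String matchBeanValue, boolean intern) {
        // Runtime concatenation always creates a new String object.
        String matchName = beanName + attributeName + ": " + matchBeanValue;
        // intern() returns the canonical instance from the JVM string pool.
        return intern ? matchName.intern() : matchName;
    }

    // Builds the same key once per rule and counts distinct objects by identity, not equals().
    static int distinctInstances(int ruleCount, boolean intern) {
        Set<String> identities =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        for (int rule = 0; rule < ruleCount; rule++) {
            // Sample values are illustrative only.
            identities.add(buildMatchName("java.lang<type=Memory>", "HeapMemoryUsage", "used", intern));
        }
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println("without intern(): " + distinctInstances(40, false));
        System.out.println("with intern():    " + distinctInstances(40, true));
    }
}
```

With 40 rules this reports 40 separate objects without intern() and 1 with it, which matches the "factor of the number of rules" observation. The change stays local to these keys, whereas -XX:+UseStringDeduplication is a JVM-wide command-line flag.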

