Also, UseStringDeduplication can’t be enabled from code.
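
For anyone who wants to check this on their own JVM, here is a minimal sketch that reads the flag at runtime. It assumes a HotSpot/OpenJDK JVM where `com.sun.management.HotSpotDiagnosticMXBean` is available. The class name `DedupFlagCheck` and the method `describe()` are made up for illustration and are not part of the exporter.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of jmx_exporter.
class DedupFlagCheck {

    // Returns a short description of the UseStringDeduplication flag,
    // or a fallback message if the JVM does not expose it.
    public static String describe() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "HotSpot diagnostic MXBean not available";
            }
            VMOption option = bean.getVMOption("UseStringDeduplication");
            // isWriteable() reports whether the flag may be changed at runtime
            // through setVMOption(); startup-only flags report false.
            return option.getName() + "=" + option.getValue()
                    + " writeable=" + option.isWriteable();
        } catch (RuntimeException e) {
            // getVMOption throws IllegalArgumentException for unknown flags.
            return "flag not available: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If `writeable` is reported as false, the flag can only be set on the command line at JVM startup, which is what the statement above refers to.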

@Yana I have a few PRs that I need to find time to review before the next 
release.

I’m targeting a release in a couple of weeks.

On Thursday, January 23, 2025 at 7:39:04 AM UTC-5 Яна Илиева wrote:

> Hi,
>
> Our intention was to minimize the scope of the desired effect to only 
> those strings which are used in the caching mechanism of the JMX exporter. 
> UseStringDeduplication is a global setting that affects the main 
> application as well and this is undesirable in our case.
>
> If there are other solutions to the problem at hand, I'll be glad to 
> explore them, of course.
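
One scoped alternative, sketched here only as an illustration and not something adopted in this thread, is a private canonicalizing map owned by the cache itself. It deduplicates equal strings without touching the JVM-wide string pool. The class `LocalInterner` and its methods are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, not part of jmx_exporter.
class LocalInterner {

    // Maps each distinct string value to the first instance seen with that value.
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns the canonical instance for the given value, registering it if new.
    public String canonical(String value) {
        String existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    // Number of distinct values currently held.
    public int size() {
        return pool.size();
    }

    // Drops all canonical instances, for example when the cache is reset.
    public void clear() {
        pool.clear();
    }

    public static void main(String[] args) {
        LocalInterner interner = new LocalInterner();
        String a = new String("bean attr: 1");
        String b = new String("bean attr: 1");
        // Both calls return the same object even though a and b are distinct.
        System.out.println(interner.canonical(a) == interner.canonical(b));
    }
}
```

The trade-off is that the map holds strong references, so entries stay on the heap until `clear()` is called, whereas strings interned through `intern()` can be collected once nothing else references them.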
>
> P.S.
> Doug, when do you think this change will be available in a release, be it 
> minor or major?
>
> Kind regards,
> Yana
>
> On Friday, December 20, 2024 at 1:57:27 AM UTC+2 Francisco Melo junior 
> wrote:
>
>> Sure, thank you.
>>
>> Can you be more specific about the consequences?
>>
>> Evidently there will be overhead from comparing each string, if that’s 
>> what you mean. But if the focus is a smaller footprint, it should be 
>> (almost) the same as intern(), because the operations are done the same 
>> number of times.
>>
>> Also, it should only remove the duplicates but keep all the strings in the 
>> pool.
>>
>> Unless the strings are used as the actual keys in the cache, in which case 
>> removing the duplicates would indeed be problematic because you would lose 
>> access to the values associated with them.
>>
>> Sent from Earth
>>
>> On Dec 19, 2024, at 3:19 PM, Doug Hoard <[email protected]> wrote:
>>
>> Francisco,
>>
>>
>> Using "-XX:+UseStringDeduplication" to enable String deduplication for 
>> the whole JVM process could have unwanted side effects, so performing the 
>> "intern()" of the cache keys directly was deemed a much safer option.
>>
>> On Wednesday, December 18, 2024 at 7:30:48 AM UTC-5 Francisco Melo junior 
>> wrote:
>>
>>> Hello, I was wondering what the result is if you enable string 
>>> deduplication (-XX:+UseStringDeduplication), or was that the caching you 
>>> did? It should reduce the footprint considerably if the duplication is the 
>>> problem, not increase it. 
>>>
>>> -Francisco
>>>
>>> On Dec 13, 2024, at 9:54 AM, Яна Илиева <[email protected]> wrote:
>>>
>>> Hello Doug,
>>>
>>>
>>> Thank you for the quick reaction and the testing. I will proceed with 
>>> opening the PR.
>>>
>>> Kind regards,
>>> Yana
>>> On Friday, December 13, 2024 at 3:44:59 PM UTC+2 Doug Hoard wrote:
>>>
>>>> Yana,
>>>>
>>>> The full integration test passed without issues. Can you create a PR of 
>>>> your findings and the change so we can document it and give credit?
>>>>
>>>> -Doug
>>>>
>>>> On Friday, December 13, 2024 at 1:44:47 AM UTC-5 Doug Hoard wrote:
>>>>
>>>>> Using String intern() can help in some scenarios and cause performance 
>>>>> issues in other scenarios.
>>>>>
>>>>> I made your change on my development branch and ran a quick test 
>>>>> (./quick-test.sh) and didn't see any unit test or integration test issues.
>>>>>
>>>>> I just started a full regression test and will have some results in 
>>>>> the morning (US time.) (It takes a little over 3 hours.)
>>>>>
>>>>> -Doug
>>>>>
>>>>> On Thursday, December 12, 2024 at 11:02:02 AM UTC-5 Яна Илиева wrote:
>>>>>
>>>>>> Hello, 
>>>>>>
>>>>>> While using the Prometheus JMX Exporter as a Java agent in our 
>>>>>> application, we observed frequent GC cleanups. It turns out that the 
>>>>>> available heap space was being exhausted by the caching in the 
>>>>>> Prometheus JMX Exporter.
>>>>>>
>>>>>> For context: we have an exporter YAML file with about 40 rules 
>>>>>> defined, and there are many JMX MBeans which we process in order to get 
>>>>>> the required metrics. Without caching enabled for the rules, a request 
>>>>>> can take a minimum of 1 minute to complete. In an attempt to reduce this 
>>>>>> time, we enabled caching and observed that the MatchedRulesCache object 
>>>>>> can take around 600 MB in a particular case. There is a realistic 
>>>>>> potential in our case for this object to grow above 1 GB, which is a 
>>>>>> huge amount of space for this kind of process.
>>>>>>
>>>>>> We identified that the major reason for the large size of 
>>>>>> MatchedRulesCache is that it contains duplicating String objects for the 
>>>>>> same MBean names. The object keeps all MBean names the rules were 
>>>>>> matched against previously, in order to avoid expensive pattern matching 
>>>>>> later. Although each rule caches the same set of MBean names, those are 
>>>>>> in fact separate objects in the heap. 
>>>>>>
>>>>>> The origin of the matter was found in 
>>>>>> https://github.com/prometheus/jmx_exporter/blob/a3dac9acee1464531cd87502579178a1fec1cc76/collector/src/main/java/io/prometheus/jmx/JmxCollector.java#L584
>>>>>>
>>>>>> where for each rule with caching enabled, there is a new String 
>>>>>> object created, although such could already exist in memory from a 
>>>>>> previous iteration.
>>>>>>
>>>>>> The duplication is by a factor of the number of rules with caching 
>>>>>> enabled.
>>>>>>
>>>>>> I experimented with interning the String, 
>>>>>>
>>>>>> *String matchName = (beanName + attributeName + ": " + 
>>>>>> matchBeanValue).intern();*
>>>>>>
>>>>>> so that the JVM's string pool is utilized and the String objects 
>>>>>> reused. The result in speed and heap space used by cache is described 
>>>>>> below:
>>>>>>
>>>>>> [image: Screenshot from 2024-12-12 14-53-26.png]
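
The effect of the interning change can be reproduced in isolation with the sketch below. It builds the same match name once per rule, the way the quoted line does, and counts how many distinct String objects end up on the heap. The class `InternSketch`, its methods, and the sample MBean values are assumptions made for illustration. Only the concatenation expression comes from the thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical demo, not part of jmx_exporter.
class InternSketch {

    // Builds the match name with the same expression as quoted in the thread.
    public static String matchName(
            String beanName, String attributeName, String matchBeanValue, boolean intern) {
        String name = beanName + attributeName + ": " + matchBeanValue;
        return intern ? name.intern() : name;
    }

    // Builds one key per rule and counts distinct objects by identity, not equals().
    public static int distinctObjects(int rules, boolean intern) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < rules; i++) {
            // Sample values, chosen only for the demo.
            keys.add(matchName("java.lang<type=Memory>", "HeapMemoryUsage", "42", intern));
        }
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        identities.addAll(keys);
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println("without intern: " + distinctObjects(40, false)); // 40
        System.out.println("with intern:    " + distinctObjects(40, true));  // 1
    }
}
```

With 40 rules, the non-interned variant holds 40 equal but separate String objects for a single MBean attribute, while the interned variant holds one. This matches the duplication factor described above.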
>>>>>>
>>>>>> What do you think about this and do you find this suggestion could 
>>>>>> have negative consequences in certain cases? 
>>>>>>
>>>>>> Kind regards,
>>>>>> Yana Ilieva
>>>>>>
>>>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Prometheus Developers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To view this discussion visit 
>>> https://groups.google.com/d/msgid/prometheus-developers/6c2d05ad-6674-424e-afe3-3289b370b286n%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/prometheus-developers/6c2d05ad-6674-424e-afe3-3289b370b286n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
