Re: RFR: 8326065: Merge Space into ContiguousSpace

2024-02-23 Thread Stefan Johansson
On Fri, 16 Feb 2024 17:00:15 GMT, Albert Mingkun Yang  wrote:

> Merging a super class into its single sub class.
> 
> Test: tier1-6

Looks good, feels like the files should change name as well. Now when there no 
longer is a `Space`. But maybe that's even better to do once we have moved 
`TenuredSpace` into Serial. 

-

Marked as reviewed by sjohanss (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/17891#pullrequestreview-1897558605


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v55]

2023-12-04 Thread Stefan Johansson
On Sat, 2 Dec 2023 07:37:26 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - Only create CPUTimeCounters if supported
>  - Ensure TTTC is destructed before publishing

Marked as reviewed by sjohanss (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1762364095


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48]

2023-12-04 Thread Stefan Johansson
On Fri, 1 Dec 2023 22:37:58 GMT, Jonathan Joo  wrote:

>> I think the ideal approach to simplify this is to support Atomic operation 
>> on a `PerfCounter`.
>> We could either introduce a `PerfAtomicCounter`/`PerfAtomicLongCounter` 
>> class, or perform `Atomic::add()` on the `PerfData::_valuep` pointer. 
>> There's already `PerfData::get_address()`, so we might be able to do:
>> 
>> 
>> Atomic::add((volatile jlong 
>> *)(instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->get_address()),
>>  net_cpu_time);
>> 
>> 
>> However, a new class `PerfAtomicCounter`  is likely cleaner. E.g., we may 
>> also want to make `PerfAtomicCounter::sample()` use a CAS. It is probably 
>> better to introduce `PerfAtomicCounter` in a separate RFE later.
>> 
>> Would the `Atomic::add()` with `PerfData::get_address()` approach be OK for 
>> now, or would we rather introduce a lock, or leave the `gc_total` mechanism 
>> as-is and address the out-of-sync-ness in a follow-up RFE?
>> 
>> IMO the out-of-sync-ness problem is minor, because users are likely to 
>> either look at a single `gc_total` counter, or look at each individual GC  
>> CPU counter and disregard `gc_total`.
>
> In the interest of the RDP1 deadline, should we leave improving the sync 
> issues with gc_total to a separate RFE? (Especially given that a "correct" 
> design may take some time to come up with, and that gc_total being slightly 
> out of sync is not a major issue.)

Me and Albert discussed this again and we are ok with handling the `gc_total` 
sync issue as a follow up. Please create the RFE for that. If that would 
include needing a `PerfAtomicCounter`, that would a be its own RFE as well. For 
me I think a lock would be a good enough solution. 

>From our point of view having the counters out of sync for a long period of 
>time (think a long concurrent mark cycle without any young collections 
>updating the total) is not good since it shows that the counters are not 
>incremented in sync. It would also be nice to avoid the two-step updating of 
>the total time, so please try to find time to work on this.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1413848440


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53]

2023-12-04 Thread Stefan Johansson
On Sat, 2 Dec 2023 01:22:33 GMT, Jonathan Joo  wrote:

>> I still think that a total counter is useful and I'd appreciate if you can 
>> keep it. To second what @caoman said before, it is GC agnostic, easy to use 
>> even for non GC experts and future proof with regards to implementation 
>> changes in the GCs. Please keep it.
>
> Put the closure in a scope, I think that should address the concern.

Yes, scoping it will work.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1413591437


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48]

2023-12-01 Thread Stefan Johansson
On Thu, 30 Nov 2023 21:43:36 GMT, Man Cao  wrote:

>> @simonis was the original suggester of this counter, so I will defer to his 
>> expertise. I do agree that dropping the counter would simplify things, but 
>> it also might not hurt to just leave it in. I'm okay with either option!
>
> Right, see @simonis 's comments at 
> https://github.com/openjdk/jdk/pull/15082#pullrequestreview-1613868256, 
> https://github.com/openjdk/jdk/pull/15082#discussion_r1321703912.
> 
> I initially had similar thought that `gc_total` isn't necessary and provides 
> redundant data. Now I agree with @simonis that the `gc_total` is valuable to 
> users. It saves users from extra work of aggregating different sets of 
> counters for different garbage collectors, and potential mistakes of missing 
> some counters. It is also future-proof that if GC implementation changes that 
> add additional threads, users wouldn't need to change their code to add the 
> counter for additional threads.
> 
> I think the maintenance overhead is quite small for `gc_total` since it is 
> mostly in this class. The benefit to users is worth it.

I agree that the counter is valuable if always up-to-date, but if it is out of 
sync compared to the "concurrent counters" I think it will confuse some users. 
So if we want to keep it I think we should try to keep it in sync. 

I suppose adding a lock for updating `gc_total` should be ok. In the safepoint 
case we should never contend on that lock and for the concurrent updates it 
should not be a big deal. Basically what I think would be to change 
`update_counter(...)` to do something like:

if (CPUTimeGroups::is_gc_counter(name)) {
  
  
instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->inc(net_cpu_time);
}


This way we would also be able to remove the publish part above, right? Any 
other problems with this approach?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1411897115


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v47]

2023-12-01 Thread Stefan Johansson
On Thu, 30 Nov 2023 01:43:28 GMT, Jonathan Joo  wrote:

>> (I just realized that I made a typo in my previous msg; should be *callee* 
>> instead.) That is what I have in mind.
>> 
>> 
>> void CPUTimeCounters::update_counter(name, total) {
>>   auto counter = get_counter(name);
>>   auto old_v = counter->get_value();
>>   auto diff = total - old_v;
>>   counter->inc(diff);
>>   if (counter->is_gc_counter()) {
>> counter->inc(diff);
>>   }
>> }
>
> I'm not sure I understood correctly, could you let me know if this latest 
> commit addresses your comment in the way you were intending?

It does.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1411876609


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53]

2023-12-01 Thread Stefan Johansson
On Thu, 30 Nov 2023 21:59:41 GMT, Man Cao  wrote:

>> src/hotspot/share/runtime/cpuTimeCounters.hpp line 59:
>> 
>>> 57:   NONCOPYABLE(CPUTimeCounters);
>>> 58: 
>>> 59:   static CPUTimeCounters* _instance;
>> 
>> I would prefer if we made the whole class static and got rid of the instance 
>> and just made the `_cpu_time_counters` array static. The only drawback I/we 
>> (discussed this with Albert as well) can see is that the memory for the 
>> array would not be accounted in NMT, but this array will always be very 
>> small so should not be a big problem. 
>> 
>> Do you see any other concerns?
>
> I thought it is typically preferred to initialize a singleton object on the 
> heap, rather than using several static variables. It is easier to control the 
> initialization order and timing of an on-heap singleton object than statics.
> 
> Moreover, for this class, `initialize()` could also check `if (UsePerfData)`, 
> and only create the singleton object under `UsePerfData`. This could save 
> some memory when `UsePerfData` is false.

I would say it depends on the use-case and here when switching to use static 
functions to use the instance it felt more like an all-static class. I agree 
that it would be nice to avoid the additional memory usage if `UsePerfData` is 
`false` so I'm ok with keeping the instance if we add that.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1411872139


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53]

2023-11-30 Thread Stefan Johansson
On Thu, 30 Nov 2023 02:45:45 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add missing include

src/hotspot/share/runtime/cpuTimeCounters.cpp line 47:

> 45:   return "conc_dedup";
> 46: default:
> 47:   ShouldNotReachHere();

My IDE complained and I guess depending on warning level we might need a return 
here.
Suggestion:

  ShouldNotReachHere();
  return "";

-

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1410454940


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53]

2023-11-30 Thread Stefan Johansson
On Thu, 30 Nov 2023 02:45:45 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add missing include

Few more comments after the latest changes.

src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 905:

> 903:   gc_threads_do(&tttc);
> 904: 
> 905:   CPUTimeCounters::publish_gc_total_cpu_time();

As I suggested in the other comment, maybe we should not keep the total 
counter, but if we do we need to make sure the destructor of the closure is run 
before the call to `publish_gc_total_cpu_time()`. Otherwise we will publish a 
not yet updated value.

src/hotspot/share/runtime/cpuTimeCounters.hpp line 59:

> 57:   NONCOPYABLE(CPUTimeCounters);
> 58: 
> 59:   static CPUTimeCounters* _instance;

I would prefer if we made the whole class static and got rid of the instance 
and just made the `_cpu_time_counters` array static. The only drawback I/we 
(discussed this with Albert as well) can see is that the memory for the array 
would not be accounted in NMT, but this array will always be very small so 
should not be a big problem. 

Do you see any other concerns?

-

Changes requested by sjohanss (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1757031262
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1410415366
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1410410265


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48]

2023-11-30 Thread Stefan Johansson
On Thu, 30 Nov 2023 06:45:02 GMT, Albert Mingkun Yang  wrote:

>> I agree that in the case that multiple closures are not active at the same 
>> time, we wouldn't have to implement it in this way. However, isn't it 
>> possible to have multiple closures active simultaneously, e.g.  vm thread, 
>> concurrent mark thread, concurrent refine thread? I think to account for the 
>> races there, we can only update the `_gc_total_cpu_time_diff` member 
>> variable atomically during these closure destructions, and then publish the 
>> actual updated `gc_total` counter at manually specified times via 
>> `publish_gc_total_cpu_time()`. If we were to call 
>> `publish_gc_total_cpu_time` as part of the thread closure, I believe it 
>> would be difficult to prevent races when accessing the underlying counter 
>> from the various gc-related threads.
>> 
>> Or maybe there is another strategy that I'm not seeing?
>
> Both `publish_gc_total_cpu_time` and `~ThreadTotalCPUTimeClosure` are called 
> by the vm-thread inside a safepoint, so there shouldn't be any other threads 
> running simultaneously, I believe.

Me and Albert just spoke and we do see the problem that two concurrent threads 
could be executing the closure at the same time. So if having a total counter 
we need to sync the updates. But when talking we started to question how useful 
it is to have the gc_total counter. It is just an aggregate of the other 
gc-counters, but it is out of sync between safepoints. So you will always get a 
more accurate value by looking at the individual gc-counters.

We came to the conclusion that it would probably be easier to drop `gc_total` 
right now, rather than trying to keep it in sync for all updates to the 
individual counters. Because having them out of sync doesn't feel like a great 
option. 

Are we missing anything or do you agree?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1410394993


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48]

2023-11-29 Thread Stefan Johansson
On Tue, 28 Nov 2023 02:22:45 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - Fix namespace issues (2)
>    
>Co-authored-by: Stefan Johansson 
> <54407259+kstef...@users.noreply.github.com>
>  - Fix namespace issues
>
>Co-authored-by: Stefan Johansson 
> <54407259+kstef...@users.noreply.github.com>

>From a testing point of view I think this is looking good now, re-ran the 
>failing test and some other jstat tests as well and they all pass.

Please address the comments from Albert and we can hopefully finish this before 
RDP1.

src/hotspot/share/runtime/cpuTimeCounters.cpp line 91:

> 89:   } while (old_value != fetched_value);
> 90:   get_counter(CPUTimeGroups::CPUTimeType::gc_total)->inc(fetched_value);
> 91: }

Why do we have to do this publish dance? Couldn't the closure that update the 
diff instead just update the counter? From what I can tell we never have 
multiple closures active at the same time so should be no race there?

-

PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1754686415
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1408914724


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v47]

2023-11-29 Thread Stefan Johansson
On Thu, 23 Nov 2023 13:46:54 GMT, Albert Mingkun Yang  wrote:

>> Jonathan Joo has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Cleanup and address comments
>
> src/hotspot/share/runtime/cpuTimeCounters.cpp line 119:
> 
>> 117: if (CPUTimeGroups::is_gc_counter(_name)) {
>> 118:   instance->inc_gc_total_cpu_time(net_cpu_time);
>> 119: }
> 
> I feel much of this is on the wrong abstraction level; 
> `CPUTimeCounters::update_counter(_name, _total);` should be sufficient here. 
> (The caller handles diff calculation and inc gc-counter if needed.)

We could add a new closure just used by GC that 's a sub-class of 
`ThreadTotalCPUTimeClosure` and just adds this to the constructor: 

instance->inc_gc_total_cpu_time(net_cpu_time);


That way we could get rid of `CPUTimeGroups::is_gc_counter()` as well since all 
those counters should use the "GC closure" or we can keep it and assert that no 
GC closure uses the wrong closure.

What do you think about that Albert, would that address your concerns?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1408923331


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v47]

2023-11-27 Thread Stefan Johansson
On Wed, 22 Nov 2023 23:08:36 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Cleanup and address comments

My additional testing found a failure when running: 
`sun/tools/jstat/jstatSnap1.sh`

When running `jstat -snap 0` (which is what the test does) with this change the 
new counters are displayed even though they belong to the unsupported namespace 
"sun.". From what I can tell, the reason for this is that when creating a new 
subsystem namespace we need to create three of them, one for each top-level 
namespace. Otherwise the code trying to figure out if this is supported or not 
is tripped over. 

I've suggested two changes to do this and with those the test in question no 
longer fails. 

It would be good if someone better familiar with jstat/perf-counter code 
verifies that this is how it should be solved.

src/hotspot/share/runtime/perfData.cpp line 76:

> 74:   "com.sun.threads",
> 75:   "sun.threads",
> 76:   "sun.threads.cpu_time",  // Subsystem for Sun Threads CPU times

Suggestion:

  "java.threads.cpu_time", //Thread CPU time name spaces
  "com.sun.threads.cpu_time",
  "sun.threads.cpu_time",

src/hotspot/share/runtime/perfData.hpp line 64:

> 62:   COM_THREADS,
> 63:   SUN_THREADS,
> 64:   SUN_THREADS_CPUTIME,  // Subsystem for Sun Threads CPU times

Suggestion:

  JAVA_THREADS_CPUTIME, // Thread CPU time name spaces
  COM_THREADS_CPUTIME,
  SUN_THREADS_CPUTIME,

-

Changes requested by sjohanss (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1750253595
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1406050140
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1406047123


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v47]

2023-11-23 Thread Stefan Johansson
On Wed, 22 Nov 2023 23:08:36 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Cleanup and address comments

I think this looks good now. I've started some additional testing to make sure 
we don't run into anything unexpected with the newly added test on any 
platforms. 

Will approve once the testing is green.

src/hotspot/share/runtime/cpuTimeCounters.hpp line 2:

> 1: /*
> 2:  * Copyright (c) 2002, 2023, Oracle and/or its affiliates. All rights 
> reserved.

This is probably on me, but this 2002 is wrong:
Suggestion:

 * Copyright (c) 2023, Oracle and/or its affiliates. All rights reserved.

-

PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1745951725
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1403070166


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v42]

2023-11-15 Thread Stefan Johansson
On Tue, 14 Nov 2023 21:33:37 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Update parallel workers time after Remark

Thanks for addressing my comments. I have a few more things:

- I think all changes to `test_g1ServiceThread.cpp` can be reverted. Should not 
be needed now
- Please fix all whitespace issues
- Should we move the VMThread and StringDedup counters into `CPUTimeCounters` 
as well? Any problem with this?

-

PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1812105608


Re: RFR: 8318706: Implement JEP 423: Region Pinning for G1 [v16]

2023-11-10 Thread Stefan Johansson
On Thu, 9 Nov 2023 10:44:29 GMT, Thomas Schatzl  wrote:

>> The JEP covers the idea very well, so I'm only covering some implementation 
>> details here:
>> 
>> * regions get a "pin count" (reference count). As long as it is non-zero, we 
>> conservatively never reclaim that region even if there is no reference in 
>> there. JNI code might have references to it.
>> 
>> * the JNI spec only requires us to provide pinning support for typeArrays, 
>> nothing else. This implementation uses this in various ways:
>> 
>>   * when evacuating from a pinned region, we evacuate everything live but 
>> the typeArrays to get more empty regions to clean up later.
>> 
>>   * when formatting dead space within pinned regions we use filler objects. 
>> Pinned regions may be referenced by JNI code only, so we can't overwrite 
>> contents of any dead typeArray either. These dead but referenced typeArrays 
>> luckily have the same header size of our filler objects, so we can use their 
>> headers for our fillers. The problem is that previously there has been that 
>> restriction that filler objects are half a region size at most, so we can 
>> end up with the need for placing a filler object header inside a typeArray. 
>> The code could be clever and handle this situation by splitting the to be 
>> filled area so that this can't happen, but the solution taken here is 
>> allowing filler arrays to cover a whole region. They are not referenced by 
>> Java code anyway, so there is no harm in doing so (i.e. gc code never 
>> touches them anyway).
>> 
>> * G1 currently only ever actually evacuates young pinned regions. Old pinned 
>> regions of any kind are never put into the collection set and automatically 
>> skipped. However assuming that the pinning is of short length, we put them 
>> into the candidates when we can.
>> 
>>   * there is the problem that if an applications pins a region for a long 
>> time g1 will skip evacuating that region over and over. that may lead to 
>> issues with the current policy in marking regions (only exit mixed phase 
>> when there are no marking candidates) and just waste of processing time 
>> (when the candidate stays in the retained candidates)
>> 
>> The cop-out chosen here is to "age out" the regions from the candidates 
>> and wait until the next marking happens.
>> 
>> I.e. pinned marking candidates are immediately moved to retained 
>> candidates, and if in total the region has been pinned for 
>> `G1NumCollectionsKeepUnreclaimable` collections it is dropped from the 
>> candidates. Its current value is fairly random.
>> 
>> * G1 pauses got a new tag if there were pinned regions in the collection 
>> set. I.e. in a...
>
> Thomas Schatzl has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Modify evacuation failure log message as suggested by sjohanss: Use 
> "Evacuation Failure" with a cause description (either "Allocation" or 
> "Pinned")

Looks good. Just a few small things.

src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 466:

> 464:   if (region_attr.is_pinned() && klass->is_typeArray_klass()) {
> 465: return handle_evacuation_failure_par(old, old_mark, word_sz, true /* 
> cause_pinned */);
> 466:   }

I think it would make sense to add a comment here that this is an optimization, 
that we know that only type array can be pinned so we only care if the region 
is pinned or not for type arrays.

src/hotspot/share/gc/g1/g1YoungGCPostEvacuateTasks.cpp line 925:

> 923:   
> _evac_failure_regions->num_regions_alloc_failed(),
> 924:   
> G1GCPhaseTimes::RestoreEvacFailureRegionsAllocFailedNum);
> 925: 

The counts used here are the totals, but they are added to thread work item, so 
total count will be wrong.

Discussed this a bit with Thomas and we should probably count those things in 
this task instead and could that way make `G1EvacFailureRegions` store a bit 
less information.

src/hotspot/share/gc/shared/collectedHeap.hpp line 169:

> 167:   static inline size_t filler_array_hdr_size();
> 168: public:
> 169:   static size_t filler_array_min_size();

Is this needed or some leftover from earlier iterations?

-

Marked as reviewed by sjohanss (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16342#pullrequestreview-1722530941
PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1389227404
PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1389211990
PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1387998596


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v40]

2023-11-09 Thread Stefan Johansson
On Thu, 9 Nov 2023 05:27:40 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add missing cpuTimeCounters files

One more thing I noticed is that the parallel worker time from the Remark pause 
is not updated after the remark pause. So I guess we should add a call to 
update that. It will still be account the next time we do a normal GC, but if 
we want the counter to be up to date as much as possible we should update it 
after the remark pause.

-

PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1803629858


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v40]

2023-11-09 Thread Stefan Johansson
On Thu, 9 Nov 2023 05:27:40 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add missing cpuTimeCounters files

src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 2093:

> 2091:   tttc.do_thread(cm_thread());
> 2092:   threads_do(&tttc);
> 2093: }

Any particular reason for having this in `G1ConcurrentMark` instead of keeping 
it where it is called in `G1ConcurrentMarkThread`. The implementation would be 
more or less the same apart from the two last lines:

  tttc.do_thread(this);
  cm()->threads_do(&tttc);

-

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1387839879


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v40]

2023-11-09 Thread Stefan Johansson
On Thu, 9 Nov 2023 05:27:40 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add missing cpuTimeCounters files

A few more comments.

src/hotspot/share/gc/g1/g1ConcurrentRefineThread.cpp line 82:

> 80:   _vtime_accum = 0.0;
> 81: }
> 82: maybe_update_threads_cpu_time();

I think the lines above here: 

if (os::supports_vtime()) {
  _vtime_accum = (os::elapsedVTime() - _vtime_start);
} else {
  _vtime_accum = 0.0;
}

Should be extracted out into the new method and instead of calling it 
`maybe_update_threads_cpu_time()` just call `track_usage()` or 
`track_cpu_time()`. The the implementation in the primary thread can then call 
this and do the extra tracking.

src/hotspot/share/gc/g1/g1ConcurrentRefineThread.cpp line 189:

> 187: void G1PrimaryConcurrentRefineThread::maybe_update_threads_cpu_time() {
> 188:   if (UsePerfData && os::is_thread_cpu_time_supported()) {
> 189: cr()->update_concurrent_refine_threads_cpu_time();

I think we should pull the tracking closure in here and that way leave the 
concurrent refine class untouched. 

Suggestion:

// The primary thread is responsible for updating the CPU time for all 
workers.
CPUTimeCounters* counters = G1CollectedHeap::heap()->cpu_time_counters();
ThreadTotalCPUTimeClosure tttc(counters, CPUTimeGroups::gc_conc_refine);
cr()->threads_do(&tttc);


This is more or less a copy from 
`G1ConcurrentRefineThreadControl::update_threads_cpu_time()` which if we go 
with this solution can be removed. The above needs some new includes though.

I change the comment a because I could not fully understand it, the primary 
thread is the one always checking and starting more threads so it is not 
stopped first. Also not sure when a terminated thread could be read. Even the 
stopped threads are still present so should be fine. If I'm missing something 
feel free to add back the comment.

src/hotspot/share/gc/g1/g1ServiceThread.cpp line 138:

> 136: ThreadTotalCPUTimeClosure tttc(counters, CPUTimeGroups::gc_service);
> 137: tttc.do_thread(task->_service_thread);
> 138:   }

Please extract this to a function, similar to the other cases something like 
`track_cpu_time()`.

-

Changes requested by sjohanss (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1722194318
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1387787912
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1387803508
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1387805665


Re: RFR: 8318706: Implement JEP 423: Region Pinning for G1 [v9]

2023-11-08 Thread Stefan Johansson
On Fri, 3 Nov 2023 14:14:36 GMT, Albert Mingkun Yang  wrote:

>> Parsing the separate components is easier :) Not sure if these tags in any 
>> way ever indicated some level of abstraction.
>> 
>> I do not have a strong opinion here. The combinations
>> 
>> (Pinned)
>> (Allocation Failure)
>> (Pinned + Allocation Failure)  // or the other way around, or some other 
>> symbol for "+" or no symbol at all?
>> 
>> are fine with me (and I thought about doing something more elaborate here), 
>> but my concern has been that any complicated string makes it less unique 
>> (e.g. `(Allocation Failure)` vs. "Allocation Failure") and adds code both to 
>> implement and parse the result.
>> 
>> Much more disrupting is likely that there is no "Evacuation Failure" string 
>> any more. But log messages are not part of the external interface, and we 
>> should not want to change them just because.
>
> The example looks good to me.

Have the final output looking something like this was agreed on during internal 
discussion:
GC(6) Pause Young (Normal) (Evacuation Failure: Pinned) 1M->1M(22M) 36.16ms
GC(6) Pause Young (Normal) (Evacuation Failure: Allocation) 1M->1M(22M) 36.16ms
GC(6) Pause Young (Normal) (Evacuation Failure: Allocation / Pinned) 
1M->1M(22M) 36.16ms

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16342#discussion_r1386736860


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v35]

2023-11-07 Thread Stefan Johansson
On Tue, 7 Nov 2023 01:06:12 GMT, Man Cao  wrote:

> I think it looks great. It is mainly refactoring that consolidates the 
> declarations/definitions of the hsperf counters in to a single file. Would it 
> be better to name that class `CPUTimeCounters`, so we could move 
> `sun.threads.cpu_time.vm` and `sun.threads.cpu_time.conc_dedup`, and future 
> JIT thread CPU counters to that class?
>
Yes, mainly refactoring and I was thinking along the same lines, but since this 
patch was just for GC and we had `CollectorCounters` already I went with this. 
I think calling the class `CPUTimeCounters` would be good and place it outside 
GC makes sense if we plan to include even more CPU time counters.

Another name that we could improve is `CPUTimeGroups` and maybe also the enum 
name `Name`, they are ok, but we might come up with something better. 

 
> Then we could also change the constructor of `ThreadTotalCPUTimeClosure` to 
> `ThreadTotalCPUTimeClosure(CPUTimeCounters* counters, CPUTimeGroups::Name 
> name)`, then it could set `_update_gc_counters` based on `name`.

I was looking at this too, but had to restructure the code more to avoid 
circular deps. If we create the general `CPUTImeCounters` we could move this 
closure to that file and then things would fit better I believe. So I like your 
proposals.

-

PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1798170871


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v37]

2023-11-06 Thread Stefan Johansson
On Wed, 1 Nov 2023 23:56:25 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - revert gitignore change
>  - Attempt to fix broken test

> The existing sun.management:type=HotspotThreading MBean approach (discussed 
> [here](https://mail.openjdk.org/pipermail/core-libs-dev/2023-September/111397.html))
>  could be another general way to track CPU. However, the discussion concludes 
> that it is an internal API, and discourages users from using it.

> @kstefanj , it is a pity that the `sun.management:type=HotspotThreading` 
> MBean is not exported any more. If we move that or a similar functionality to 
> a new MBean under `com.sun.management` (as proposed in the [cited 
> discussion](https://mail.openjdk.org/pipermail/core-libs-dev/2023-September/111397.html))
>  then we might reuse these new hsperf counters in the same way this is 
> already done by some other MBeans which already use hsperf counters as their 
> information source. I think logging or JFR functionality could also easily be 
> implemented on top of the new hsperf counters.

I haven't looked at the details around this, but extracting some useful 
information from the internal bean and making it public sound reasonable to me.

-

PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1794693970


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v35]

2023-11-06 Thread Stefan Johansson
On Thu, 2 Nov 2023 01:20:59 GMT, Jonathan Joo  wrote:

> Could you elaborate a bit on what you were thinking of here? If we are 
> assuming something like a thread that updates all other threads, I think this 
> implementation could get a bit complicated.
> 
> There are two main issues that we can see with a generic thread approach:
> 
> 1. We would have to figure out how often to pull metrics from the various 
> gc threads from the central thread, and possibly determine this frequency 
> separately for every thread. Instead with our current implementation, we can 
> manually trigger publishes based on when the GC thread is done doing work.
> 
> 2. We would still need to tag each thread we want to track somewhere, and 
> keep track of a mapping from thread to its counter name, etc. which doesn't 
> seem to simplify things too much. (I imagine we will still need to touch 
> numerous files to "tag" each thread with whether we want to track it or not?)

I agree, I was not thinking about having a separate thread, more about trying 
to group this information in a way that it would be easier, for example, to 
provide periodic JFR events for the CPU times collected. Having something like 
a `CollectorCPUTimeCounters` (we already have `CollectorCounters`). Such a 
class could keep the different counters making it easier to get an overview of 
which CPU time counters are present. But this would also require a mapping 
between thread and counter (but it might be as simple as having an enum). 

I played around a bit instead of trying to explain what I mean and this is not 
very polished, but I was thinking something like this: 
https://github.com/openjdk/jdk/compare/pr/15082...kstefanj:jdk:pull/15082-idea

What do you think? This way we don't add things to `CollectedHeap` as well, 
which is usually good unless really needed.

-

PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1794686146


Re: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v35]

2023-11-01 Thread Stefan Johansson
On Tue, 31 Oct 2023 04:23:13 GMT, Jonathan Joo  wrote:

>> 8315149: Add hsperf counters for CPU time of internal GC threads
>
> Jonathan Joo has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Replace NULL with nullptr

Sorry for being a bit late to this PR. I think the addition of CPU time 
tracking is good, but I wonder if we could do it in a way that is a bit more 
general. A more general way of tracking CPU time for a set of threads and we 
could then have different consumers of this data. In addition to hsperf 
counters I think having logging and JFR events for this could be interesting as 
well. 

Have you had any thought along those lines, any obvious problems with such 
approach?

-

PR Comment: https://git.openjdk.org/jdk/pull/15082#issuecomment-1788665727


Integrated: 8307448: Test RedefineSharedClassJFR fail due to wrong assumption

2023-05-04 Thread Stefan Johansson
On Thu, 4 May 2023 12:08:28 GMT, Stefan Johansson  wrote:

> Please review this fix to avoid a tier1 test failure.
> 
> **Summary**
> The newly added test wrongfully assumed that one of the transformed classes 
> would be in use. This seem to be the case in most runs, but there is no 
> guarantee. This change fixes the test to only verify that nothing is seen as 
> shared when running with -Xshare:off.
> 
> **Testing**
> Verified that the test still passes locally.

This pull request has now been integrated.

Changeset: 29233e0a
Author:Stefan Johansson 
URL:   
https://git.openjdk.org/jdk/commit/29233e0a001adde71a3fa5d56292ccfba8409ea5
Stats: 10 lines in 1 file changed: 1 ins; 3 del; 6 mod

8307448: Test RedefineSharedClassJFR fail due to wrong assumption

Reviewed-by: eosterlund, coleenp

-

PR: https://git.openjdk.org/jdk/pull/13801


Re: RFR: 8307448: Test RedefineSharedClassJFR fail due to wrong assumption

2023-05-04 Thread Stefan Johansson
On Thu, 4 May 2023 12:36:27 GMT, Coleen Phillimore  wrote:

>> Please review this fix to avoid a tier1 test failure.
>> 
>> **Summary**
>> The newly added test wrongfully assumed that one of the transformed classes 
>> would be in use. This seem to be the case in most runs, but there is no 
>> guarantee. This change fixes the test to only verify that nothing is seen as 
>> shared when running with -Xshare:off.
>> 
>> **Testing**
>> Verified that the test still passes locally.
>
> Thanks for fixing this so quickly.

Thanks @coleenp and @fisk for the quick reviews. 

Since the test failure shows up in tier 1 I will push this without waiting 24 
hours.

-

PR Comment: https://git.openjdk.org/jdk/pull/13801#issuecomment-1534718633


RFR: 8307448: Test RedefineSharedClassJFR fail due to wrong assumption

2023-05-04 Thread Stefan Johansson
Please review this fix to avoid a tier1 test failure.

**Summary**
The newly added test wrongfully assumed that one of the transformed classes 
would be in use. This seem to be the case in most runs, but there is no 
guarantee. This change fixes the test to only verify that nothing is seen as 
shared when running with -Xshare:off.

**Testing**
Verified that the test still passes locally.

-

Commit messages:
 - Updated comment
 - 8307448: Test RedefineSharedClassJFR fail due to wrong assumption

Changes: https://git.openjdk.org/jdk/pull/13801/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13801&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8307448
  Stats: 10 lines in 1 file changed: 1 ins; 3 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/13801.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13801/head:pull/13801

PR: https://git.openjdk.org/jdk/pull/13801


Integrated: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared

2023-05-04 Thread Stefan Johansson
On Fri, 28 Apr 2023 12:48:44 GMT, Stefan Johansson  wrote:

> Hi all, 
> 
> Please review this change to avoid CleanClassLoaderDataMetaspaces safepoint 
> when there is nothing that can be cleaned up.
> 
> **Summary**
> When transforming/redefining classes a previous version list is linked 
> together in the InstanceKlass. The original class is added to this list if it 
> is still used or shared. The difference between shared and used is not 
> currently noted. This leads to a problem when doing concurrent class 
> unloading, because during that we postpone some potential work to a safepoint 
> (since we are not in one). This is the CleanClassLoaderDataMetaspaces and it 
> is triggered by the ServiceThread if there is work to be done, for example if 
> InstanceKlass::_has_previous_versions is true.
> 
> Since we currently does not differentiate between shared and "in use" we 
> always set _has_previous_versions if anything is on this list. This together 
> with the fact that shared previous versions should never be cleaned out leads 
> to this safepoint being triggered after every concurrent class unloading even 
> though there is nothing that can be cleaned out.
> 
> This can be avoided by making sure the _previous_versions list is only 
> cleaned when there are non-shared classes on it. This change renames 
> `_has_previous_versions` to `_clean_previous_versions` and only updates it if 
> we have non-shared classes on the list.
> 
> **Testing**
> * A lot of manual testing verifying that we do get the safepoint when we 
> should. 
> * Added new test to verify expected behavior by parsing the logs. The test 
> uses JFR to trigger redefinition of some shared classes (when -Xshare:on).
> * Mach5 run of new test and tier 1-3

This pull request has now been integrated.

Changeset: 408cec51
Author:Stefan Johansson 
URL:   
https://git.openjdk.org/jdk/commit/408cec516bb5fd82fb6dcddeee934ac0c5ecffaf
Stats: 150 lines in 6 files changed: 127 ins; 3 del; 20 mod

8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions 
are shared

Reviewed-by: coleenp, sspitsyn

-

PR: https://git.openjdk.org/jdk/pull/13716


Re: RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared [v2]

2023-05-04 Thread Stefan Johansson
On Fri, 28 Apr 2023 15:57:49 GMT, Coleen Phillimore  wrote:

>> Stefan Johansson has updated the pull request incrementally with two 
>> additional commits since the last revision:
>> 
>>  - Test refactor
>>  - Serguei review
>
> This looks good. Thanks for all the testing and adding the new test.

Thanks again @coleenp and @sspitsyn for the reviews.

-

PR Comment: https://git.openjdk.org/jdk/pull/13716#issuecomment-1534557035


Re: RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared [v2]

2023-05-03 Thread Stefan Johansson
> Hi all, 
> 
> Please review this change to avoid CleanClassLoaderDataMetaspaces safepoint 
> when there is nothing that can be cleaned up.
> 
> **Summary**
> When transforming/redefining classes a previous version list is linked 
> together in the InstanceKlass. The original class is added to this list if it 
> is still used or shared. The difference between shared and used is not 
> currently noted. This leads to a problem when doing concurrent class 
> unloading, because during that we postpone some potential work to a safepoint 
> (since we are not in one). This is the CleanClassLoaderDataMetaspaces and it 
> is triggered by the ServiceThread if there is work to be done, for example if 
> InstanceKlass::_has_previous_versions is true.
> 
> Since we currently does not differentiate between shared and "in use" we 
> always set _has_previous_versions if anything is on this list. This together 
> with the fact that shared previous versions should never be cleaned out leads 
> to this safepoint being triggered after every concurrent class unloading even 
> though there is nothing that can be cleaned out.
> 
> This can be avoided by making sure the _previous_versions list is only 
> cleaned when there are non-shared classes on it. This change renames 
> `_has_previous_versions` to `_clean_previous_versions` and only updates it if 
> we have non-shared classes on the list.
> 
> **Testing**
> * A lot of manual testing verifying that we do get the safepoint when we 
> should. 
> * Added new test to verify expected behavior by parsing the logs. The test 
> uses JFR to trigger redefinition of some shared classes (when -Xshare:on).
> * Mach5 run of new test and tier 1-3

Stefan Johansson has updated the pull request incrementally with two additional 
commits since the last revision:

 - Test refactor
 - Serguei review

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/13716/files
  - new: https://git.openjdk.org/jdk/pull/13716/files/39c3a1c1..834174f9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13716&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13716&range=00-01

  Stats: 47 lines in 5 files changed: 13 ins; 2 del; 32 mod
  Patch: https://git.openjdk.org/jdk/pull/13716.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13716/head:pull/13716

PR: https://git.openjdk.org/jdk/pull/13716


Re: RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared [v2]

2023-05-03 Thread Stefan Johansson
On Fri, 28 Apr 2023 15:57:49 GMT, Coleen Phillimore  wrote:

>> Stefan Johansson has updated the pull request incrementally with two 
>> additional commits since the last revision:
>> 
>>  - Test refactor
>>  - Serguei review
>
> This looks good. Thanks for all the testing and adding the new test.

Thanks for the reviews @coleenp and @sspitsyn.

Pushed two changes according to Sergueis suggestions.

-

PR Comment: https://git.openjdk.org/jdk/pull/13716#issuecomment-1532646363


RFR: 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous versions are shared

2023-04-28 Thread Stefan Johansson
Hi all, 

Please review this change to avoid CleanClassLoaderDataMetaspaces safepoint 
when there is nothing that can be cleaned up.

**Summary**
When transforming/redefining classes a previous version list is linked together 
in the InstanceKlass. The original class is added to this list if it is still 
used or shared. The difference between shared and used is not currently noted. 
This leads to a problem when doing concurrent class unloading, because during 
that we postpone some potential work to a safepoint (since we are not in one). 
This is the CleanClassLoaderDataMetaspaces and it is triggered by the 
ServiceThread if there is work to be done, for example if 
InstanceKlass::_has_previous_versions is true.

Since we currently does not differentiate between shared and "in use" we always 
set _has_previous_versions if anything is on this list. This together with the 
fact that shared previous versions should never be cleaned out leads to this 
safepoint being triggered after every concurrent class unloading even though 
there is nothing that can be cleaned out.

This can be avoided by making sure the _previous_versions list is only cleaned 
when there are non-shared classes on it. This change renames 
`_has_previous_versions` to `_clean_previous_versions` and only updates it if 
we have non-shared classes on the list.

**Testing**
* A lot of manual testing verifying that we do get the safepoint when we 
should. 
* Added new test to verify expected behavior by parsing the logs. The test uses 
JFR to trigger redefinition of some shared classes (when -Xshare:on).
* Mach5 run of new test and tier 1-3

-

Commit messages:
 - Add test wih JFR
 - 8306929: Avoid CleanClassLoaderDataMetaspaces safepoints when previous 
versions are shared

Changes: https://git.openjdk.org/jdk/pull/13716/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13716&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8306929
  Stats: 139 lines in 6 files changed: 116 ins; 3 del; 20 mod
  Patch: https://git.openjdk.org/jdk/pull/13716.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13716/head:pull/13716

PR: https://git.openjdk.org/jdk/pull/13716


Re: [External] : Re: Extend Native Memory Tracking over the JDK ? (was: Proposal: track zlib native memory usage with NMT)

2022-12-01 Thread Stefan Johansson

Hi Thomas and Carter,

I opened up a PR for this to allow more specific comments on the 
implementation:

https://github.com/openjdk/jdk/pull/11449

If this discussion leads to us not wanting to proceed with the change I 
will withdraw the PR.


Some more comments below.

On 2022-12-01 08:26, Thomas Stüfe wrote:

Hi Carter, Stefan,

thank you, I think it is good to have this discussion, it is important.

Side note, the discussion steered away from my original question - 
whether to instrument the JDK with NMT. I still would love to discuss 
that, too.




Sorry for that :)

About opening NMT up for user consumption, that is of course possible. 
But I think the bigger question is which data we want to open for user 
consumption, and at what granularity. And what contracts do we enter 
when we do this.




To me this is not so much opening it up, but just making it much simpler 
to get the already available data (JFR instead of jcmd). I get your 
point that when we make it easier it will likely get more visibility and 
that could generate expectations. To me the contract on these events 
should not be much harder than, for example, the contract we have on the 
format of GC logs. So we should not be locked down by this.


NMT was originally a hotspot-dev-centric tool. It has a lot of 
idiosyncrasies. Interpreting the results needs detailed knowledge about 
hotspot memory management. Some examples:


- its reports are not consistent across JDK versions, not even across 
different patch levels of the same JDK. So you cannot compare results, 
say, between JDK11 and 17.
- before a certain version X (I believe JDK 11), the full thread stacks 
were accounted for instead of just the in-use portion of the thread 
stacks. I remember reading blogs about how thread stack consumption went 
down when all that changed was NMT reporting.
- The memory sizes it shows may not have much to do with real RSS. It 
systematically underreports some things, since it omits libc overhead 
and retention, usage by system- and JNI libraries. But it also 
overreports things since it mostly (not always) accounts in terms of 
"committed" memory, which usually means mmap()ed or malloc()ed memory. 
But that is just committed, not physical memory, it does not translate 
to RSS usage directly. That memory may never be touched. OTOH NMT probes 
thread stacks with mincore(), so for that section, "committed" really 
means "physical".




I agree that NMT is a low-level tool and that it's not perfect. But in 
some cases I think it's the best way to see the memory consumption of 
the JVM. Especially since you can zoom in on certain areas.


I am fine with opening up NMT via JFR. But does this mean we have to be 
more consistent? Do we have to care about downward compatibility of NMT 
reports? Are we then still free to redesign the tag system (see my 
original mail) or will this tie us down with the current NMT tag system 
forever? As a negative example, JFR exposes metaspace allocator details 
(chunk statistics) which have been broken ever since JDK 16 when the 
underlying implementation changed.




I think a tag based system for NMT would be awesome and it would be 
really sad if exposing the NMT information through JFR would stop us 
from doing this. Hopefully the only thing we need to do when improving 
NMT is to do CSRs. One possible way to avoid constraints even more would 
be to tag those events as "experimental" at first. This would signal 
that user should not rely on them.



Therefore I am curious about what end users use NMT really for.

@Carter: can you give us examples of which NMT sections had been 
particularly useful to you? Maybe we can define a subset to expose 
instead of exposing all tags. E.g. I can see thread stack usage being 
very useful, but things like ObjectMonitor footprint not so much.




I agree that not to many users would care about the ObjectMonitor 
footprint, but unless we get constrained by what we report I would like 
to report all. If there are constraints, this might be a good middle road.


Thanks,
Stefan


Cheers, Thomas




On Wed, Nov 30, 2022 at 9:45 PM Carter Kozak <mailto:cko...@ckozak.net>> wrote:


__
This looks fantastic, thank you so much! I can confirm that the
proposed
design would solve my use-case.

I'd enjoy discussing the NMT event  contract somewhere more specific
to the implementation, but I don't want to muddle this thread with
implementation details.

    Carter Kozak

On Wed, Nov 30, 2022, at 03:37, Stefan Johansson wrote:

Hi Carter,

Your mail made me pick up an old item from my wishlist: to have
native
memory tracking information available in JFR recordings. When we,
in GC,
do improvements to decrease the native memory overhead of our
algorithms, NMT is a very good tool to track the progress. We have
scripts that sound very similar to what you des

Re: Extend Native Memory Tracking over the JDK ? (was: Proposal: track zlib native memory usage with NMT)

2022-11-30 Thread Stefan Johansson

Hi Carter,

Your mail made me pick up an old item from my wishlist: to have native 
memory tracking information available in JFR recordings. When we, in GC, 
do improvements to decrease the native memory overhead of our 
algorithms, NMT is a very good tool to track the progress. We have 
scripts that sound very similar to what you describe and more than once 
I've been thinking about adding this information into JFR. But it has 
not been a priority and the greater value has been unclear.


Hearing that others might also benefit from such a change I took a 
discussion with the JFR team on how to best proceed with this. I have 
created a branch for this and will probably create a PR for it shortly, 
but I thought I would drop it here first:

https://github.com/kstefanj/jdk/tree/8157023-jfr-events-for-nmt

The change adds two new JFR events: one for the total usage and one for 
the usage of each memory type. These are sent only if Native Memory 
Tracking is turned on, and they are enabled in the default JFR profile 
with an interval of 1s. This might change during reviewing but it was a 
good starting point.


With this you will be able to use JFR streaming to access the events 
from within your running process. I hope this will help your use cases 
and please let us know if you have any comments or suggestions.


Thanks,
Stefan


On 2022-11-10 16:58, Carter Kozak wrote:

/+serviceability-dev/

Firstly, thank you both for your time and work in this space. Apologies 
if this should be a separate thread, but the new title “Extend Native 
Memory Tracking over the JDK” aligns directly with some work I’ve been 
investigating, and I hope my feedback will be helpful for prioritization 
of zlib observability as well as the way users think about native memory 
tracking in general.


Observability of native memory in the JVM is critically important, and 
becomes even more valuable as the industry shifts to more and smaller 
services deployed in right-sized container environments like kubernetes. 
Each new JDK release (major and hotfix) offers dramatic improvements, 
often based on some form of trade-off. To be clear, I cannot overstate 
how impressed I am with quality and velocity of improvement! However, 
these trade-offs impact the way that memory is used, and it’s a 
difficult balance to ensure containers use the correct amount of memory 
without being wasteful (over-provisioned) or oomkilled (under-provisioned).


In production, I have thousands of JVMs running with native memory 
tracking summary enabled. Real-time monitoring of the output is painful 
and inefficient. Currently the only supported option I’m aware of is 
shelling out to create a new jcmd process and parsing the NMT summary 
text output periodically. In older releases, it was possible to bypass 
the jcmd process by self-attaching, but that was limited in jdk9 by 
JDK-8178380 , and still 
required the caller to parse human-readable strings. In fact, attachment 
issues in some JDK/environment combinations make automated attachment 
/dangerous/ in a way that has crashed the JVM — that may be a story for 
another day, but my point is that simple, efficient NMT data collection 
would go a very long way. Many modern observability tools, especially 
those used in container deployments, operate by reading data from within 
the jvm process, and relaying it to a storage system (Prometheus may be 
the most ubiquitous example).


For my use-case, I’d love to have a simple API I could invoke from java 
code to access structured native-memory-tracking data, similar in a way 
to MemoryPoolMXBean for heap pools (although JMX isn’t necessary for me, 
it aligns with other observability APIs in the JDK). Additionally, I’d 
like to provide JFR events to periodically record native memory tracking 
metadata when enabled for better out-of-the-box experience with JMC. 
I’ve begun investigating some options for JDK-8182634 
, but would appreciate 
feedback before I propose any sort of code change.


Thank you all for beginning this discussion, I’m eager to see the ways 
the JDK continues to improve upon observability features, and do what 
small part I can to help!


Carter Kozak