On Tue, 2 Sep 2025 22:46:52 GMT, William Kemper <[email protected]> wrote:
>> ### Summary
>>
>> The new implementation of ObjectCountAfterGC for Shenandoah piggybacks off
>> of the existing marking phases and records strongly marked objects in a
>> histogram. If the event is disabled, the original marking closures are used.
>> When enabled new mark-and-count closures are used by the worker threads.
>> Each worker thread updates its local histogram as it marks an object. These
>> local histograms are merged at the conclusion of the marking phase under a
>> mutex. The event is emitted outside a safepoint. Because (most) Shenandoah's
>> marking is done concurrently, so is the object counting work.
>>
>> ### Performance
>> The performance test were ran using the Extremem benchmark on a default and
>> stress workload. (will edit this section to include data after average time
>> and test for GenShen)
>>
>> #### Default workload:
>> ObjectCountAfterGC disabled (master branch):
>> `[807.216s][info][gc,stats ] Pause Init Mark (G) = 0.003 s
>> (a = 264 us)`
>> `[807.216s][info][gc,stats ] Pause Init Mark (N) = 0.001 s
>> (a = 91 us)`
>> `[807.216s][info][gc,stats ] Concurrent Mark Roots = 0.041 s
>> (a = 4099 us)`
>> `[807.216s][info][gc,stats ] Concurrent Marking = 1.660 s
>> (a = 166035 us)`
>> `[807.216s][info][gc,stats ] Pause Final Mark (G) = 0.004 s
>> (a = 446 us) `
>> `[807.216s][info][gc,stats ] Pause Final Mark (G) = 0.004 s
>> (a = 446 us) `
>> `[807.216s][info][gc,stats ] Pause Final Mark (N) = 0.004 s
>> (a = 357 us)`
>>
>> ObjectCountAfterGC disabled (feature branch):
>> `[807.104s][info][gc,stats ] Pause Init Mark (G) = 0.003 s
>> (a = 302 us)`
>> `[807.104s][info][gc,stats ] Pause Init Mark (N) = 0.001 s
>> (a = 92 us) `
>> `[807.104s][info][gc,stats ] Concurrent Mark Roots = 0.048 s
>> (a = 4827 us)`
>> `[807.104s][info][gc,stats ] Concurrent Marking = 1.666 s
>> (a = 166638 us) `
>> `[807.104s][info][gc,stats ] Pause Final Mark (G) = 0.006 s
>> (a = 603 us)`
>> `[807.104s][info][gc,stats ] Pause Final Mark (N) = 0.005 s
>> (a = 516 us)`
>>
>> ObjectCountAfterGC enabled (feature branch)
>> `[807.299s][info][gc,stats ] Pause Init Mark (G) = 0.002 s
>> (a = 227 us)`
>> `[807.299s][info][gc,stats ] Pause Init Mark (N) = 0.001 s
>> (a = 89 us) `
>> `[807.299s][info][gc,stats ] Concurrent Mark Roots ...
>
> src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 179:
>
>> 177: // Use object counting closure if ObjectCount or ObjectCountAfterGC
>> event is enabled.
>> 178: const bool object_count_enabled =
>> ObjectCountEventSender::should_send_event();
>> 179: if (object_count_enabled &&
>> !ShenandoahHeap::heap()->mode()->is_generational()) {
>
> Can the generational mode support this? It can also perform a _global_
> collection.
It could be possible to extend the object counting closure for GenShen. I would
have to look more deep into where the closure can be used.
> src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 130:
>
>> 128: // Create the KlassInfoTable for Shenandoah only if JFR is
>> enabled.
>> 129: #if INCLUDE_JFR
>> 130: KlassInfoTable cit(false);
>
> Might consider a `CADR` helper class here (Constructor Acquires, Destructor
> Releases). There are a lot of lines of code between the two `set_cit` calls
> (and the second one isn't guarded by `INCLUDE_JFR`).
I'll look into that. Thanks for pointing out the unguarded `set_cit`.
> src/hotspot/share/memory/heapInspection.hpp line 33:
>
>> 31: #include "oops/objArrayOop.hpp"
>> 32: #include "oops/oop.hpp"
>> 33: #include "runtime/mutex.hpp"
>
> Does this need to be in the header?
I had an error in GHA during my early implementation, so I included that
header. I'll remove and test to see if it is needed or not.
> test/jdk/jdk/jfr/event/gc/objectcount/TestObjectCountAfterGCEventWithShenandoah.java
> line 11:
>
>> 9: * & vm.opt.ExplicitGCInvokesConcurrent != true
>> 10: * @library /test/lib /test/jdk
>> 11: * @run main/othervm -XX:+UnlockExperimentalVMOptions
>> -XX:-UseFastUnorderedTimeStamps -XX:+UseShenandoahGC
>> -XX:MarkSweepDeadRatio=0 -XX:-UseCompressedOops
>> -XX:-UseCompressedClassPointers -XX:+IgnoreUnrecognizedVMOptions
>> jdk.jfr.event.gc.objectcount.TestObjectCountAfterGCEventWithShenandoah
>
> Are all of these flags necessary to run this test? Can we pare any
> unnecessary options?
All of the collectors that test for the ObjectCountAfterGC event uses these
flags. I thought it would be best to keep it consistent for the Shenandoah test.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/26977#discussion_r2317633120
PR Review Comment: https://git.openjdk.org/jdk/pull/26977#discussion_r2317630478
PR Review Comment: https://git.openjdk.org/jdk/pull/26977#discussion_r2317545276
PR Review Comment: https://git.openjdk.org/jdk/pull/26977#discussion_r2317555401