Thanks, Erik.

Is there any chance you could point me toward the documentation for the 
jdk.Safepoint* events? It's difficult to tell at a glance whether the sum of 
the differences between the safepoint begin and end times is equivalent to 
hotspotRuntimeManagementBean.getTotalSafepointTime(). Does that total include 
the sync time (which appears to have its own SafepointStateSynchronization 
event)? I also don't understand what beginSafepointEvent.getDuration() means 
in this context: is sync time included only in the safepoint sync event, in 
the duration of the SafepointBegin event, or in the time between the end times 
of the begin and end events?

In logging framework code we've found that converting between time objects and 
numeric values doesn't always (on some JREs, possibly never) optimize away 
Instant allocations. The default JFR configuration appears to set time 
thresholds for safepoint events, which I'd expect reduces overhead but would 
prevent us from accumulating data when there are frequent, quick safepoints. 
Without those thresholds, the cost is likely to be much greater than the old 
approach.
In a similar vein, is there any way for us to accumulate this data without 
dedicating an entire OS thread to the RecordingStream?
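
To make the question concrete, here is a minimal sketch of the accumulation 
being discussed. It assumes that jdk.SafepointBegin and jdk.SafepointEnd both 
expose a "safepointId" field that can pair them (worth verifying against the 
event metadata), and that the span from the begin event's start time to the 
end event's end time is the quantity of interest; whether that span matches 
getTotalSafepointTime() is exactly the open question. The class 
SafepointTimeAccumulator and its method names are hypothetical. Note that 
startAsync() avoids blocking the caller, but the stream still runs on its own 
background thread, and each event still surfaces Instant values.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import jdk.jfr.consumer.RecordingStream;

// Hypothetical helper: sums the time between paired safepoint begin/end events.
final class SafepointTimeAccumulator {
    // Begin timestamps keyed by safepoint id, waiting for the matching end event.
    private final Map<Long, Instant> pendingBegins = new ConcurrentHashMap<>();
    private final AtomicLong totalNanos = new AtomicLong();

    void onBegin(long safepointId, Instant start) {
        pendingBegins.put(safepointId, start);
    }

    void onEnd(long safepointId, Instant end) {
        Instant start = pendingBegins.remove(safepointId);
        if (start != null) { // ignore end events whose begin was never seen
            totalNanos.addAndGet(Duration.between(start, end).toNanos());
        }
    }

    long totalNanos() {
        return totalNanos.get();
    }

    // Wires the accumulator to JFR. The caller is responsible for closing the stream.
    RecordingStream startStreaming() {
        RecordingStream r = new RecordingStream();
        // Drop the thresholds so that short safepoints are not filtered out.
        r.enable("jdk.SafepointBegin").withoutThreshold();
        r.enable("jdk.SafepointEnd").withoutThreshold();
        // Assumption: both events carry a "safepointId" field.
        r.onEvent("jdk.SafepointBegin",
            e -> onBegin(e.getLong("safepointId"), e.getStartTime()));
        r.onEvent("jdk.SafepointEnd",
            e -> onEnd(e.getLong("safepointId"), e.getEndTime()));
        r.startAsync(); // returns immediately; events arrive on a JFR thread
        return r;
    }
}
```

A metrics gauge could then poll totalNanos() periodically, much as it polled 
getTotalSafepointTime() before.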

Sorry for the barrage of questions, I appreciate your help!
Carter Kozak

On Wed, Jun 16, 2021, at 15:44, Erik Gahlin wrote:
> It's possible to access safepoint information using JFR, for example:
> 
> try (RecordingStream r = new RecordingStream()) {
>   r.enable("jdk.SafepointBegin");
>   r.enable("jdk.SafepointEnd");
>   r.onEvent("jdk.SafepointBegin",
>       e -> System.out.println("begin: " + e.getEndTime()));
>   r.onEvent("jdk.SafepointEnd",
>       e -> System.out.println("end: " + e.getEndTime()));
>   r.start();
> }
> 
> Erik
> 
> On 2021-06-16, 21:12, "serviceability-dev on behalf of Carter Kozak" 
> <serviceability-dev-r...@openjdk.java.net on behalf of cko...@ckozak.net> 
> wrote:
> 
>     As Java 16 and beyond lock down access to internal components by default, 
> it can be difficult to produce Prometheus-style metrics describing 
> application safepoints. I’ve been monitoring these metrics so that I can be 
> alerted when an application spends more than ~10% of time in safepoints for 
> some duration, because it means that something has gone wrong: most often GC 
> spirals, although excessive thread dumps, deadlock checks, etc. can 
> contribute. This has been one of the most meaningful tools in my arsenal to 
> detect general JVM badness, but there doesn’t seem to be a way to access the 
> data in newer JREs without allowing access to internal components.
>     
>     Previously I was able to use something along these lines, which required 
> legacy sun.management component access.
>     
>     sun.management.HotspotRuntimeMBean hotspotRuntimeManagementBean = 
> sun.management.ManagementFactoryHelper.getHotspotRuntimeMBean();
>     long totalSafepointTimeMillis = 
> hotspotRuntimeManagementBean.getTotalSafepointTime();
>     
>     Before I get ahead of myself, I’d like to confirm that I haven’t missed a 
> supported pathway to access safepoint time. If my read is correct and there’s 
> no way to access this information from inside the running JVM, would it be 
> a reasonable addition to the public API?
>     
>     Thanks,
>     Carter Kozak
>     
> 
> 
