Re: [External] : Re: New candidate JEP: 451: Prepare to Disallow the Dynamic Loading of Agents

Ron Pressler Fri, 19 May 2023 14:31:48 -0700

I fully understand why authors of (truly excellent, in your case!) advanced 
serviceability agents would see any feature that could affect their area of 
interest as a problem, but you surely understand that our responsibility is 
toward a far larger ecosystem. Even the ultimate restriction — which is NOT the 
subject of this JEP — would only require the addition of a command line flag to 
retain the current behaviour, but all this JEP does is add a warning. That’s 
the best mechanism that would allow us to estimate the impact of a future 
restriction, so you should welcome it. This JEP disallows nothing.
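
The command-line flag mentioned above is not named in this message. As a minimal sketch, assuming it is the -XX:+EnableDynamicAgentLoading option described in the text of JEP 451, and assuming a HotSpot JVM, the current setting could be read at run time roughly like this (the class and method names are hypothetical):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports how the running JVM is configured with respect
// to dynamic agent loading. Assumes HotSpot and the option name from JEP 451.
final class DynamicAgentLoadingCheck {

    // Returns "true" or "false" as reported by the VM, or "unknown" when the
    // option or the HotSpot diagnostic bean is not available.
    static String setting() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unknown";
            }
            return bean.getVMOption("EnableDynamicAgentLoading").getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the VM does not know an option with this name.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("EnableDynamicAgentLoading = " + setting());
    }
}
```

Starting the same program with and without -XX:+EnableDynamicAgentLoading on the command line should show whether the option was set explicitly for that deployment.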


On a more personal note, my fear is not that profilers (in general, not just 
those that require agents) are now used too much, but that they’re used too 
little, including profilers that don’t require agents at all. I’m afraid we’ll 
find out that even if the usage of such tools were to grow by 10x it would 
still be negligible. I’d love to see (and your help putting such capabilities 
into the JDK would be much appreciated) greater awareness of the importance of 
profilers, and much greater use.

As to JFR, I am aware that many improvements are necessary, but a new 
stacktrace capture mechanism that JFR is about to start using is something that 
we are unlikely to ever manage to expose through an API because it requires 
close collaboration with HotSpot internals. Stay tuned, as it’s really 
exciting! You mention that JVM TI serves multiple uses, and you are absolutely 
right, but the plan is to incorporate the “read only” side into JFR, as it has 
some foundational benefits over JVM TI in addition to the one I mentioned (JFR 
is asynchronous whereas JVM TI is synchronous, which causes a whole lot of pain 
for VM engineers).
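
As an illustration of profiling without any agent, here is a minimal sketch that uses the public jdk.jfr API shipped with the JDK to record execution samples in-process. The class name, the 20 ms sampling period, and the busy loop are assumptions made for the example; it uses only the existing public API, not the new stack-trace capture mechanism mentioned above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical example: in-process JFR recording, no agent involved.
final class JfrSamplingSketch {

    // Records execution samples while a busy loop runs, dumps the recording
    // to 'out', and returns how many jdk.ExecutionSample events it contains.
    static int recordSamples(Path out, long busyMillis) throws IOException {
        try (Recording recording = new Recording()) {
            // The 20 ms period is an arbitrary choice for this sketch.
            recording.enable("jdk.ExecutionSample").withPeriod(Duration.ofMillis(20));
            recording.start();
            burnCpu(busyMillis);
            recording.stop();
            recording.dump(out);
        }
        int samples = 0;
        for (RecordedEvent event : RecordingFile.readAllEvents(out)) {
            if (event.getEventType().getName().equals("jdk.ExecutionSample")) {
                samples++;
            }
        }
        return samples;
    }

    // Keeps one thread busy so that the sampler has Java frames to observe.
    static long burnCpu(long millis) {
        long deadline = System.nanoTime() + millis * 1_000_000L;
        long acc = 0;
        while (System.nanoTime() < deadline) {
            acc += Long.hashCode(acc * 31 + 7);
        }
        return acc;
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("sketch", ".jfr");
        System.out.println(recordSamples(out, 500) + " execution samples written to " + out);
    }
}
```

The number of samples depends on the machine and on how long the loop runs, so the sketch makes no claim about a particular count.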


— Ron

> On 19 May 2023, at 19:40, Andrei Pangin <andrei.pan...@gmail.com> wrote:
> 
> Hi Ron,
> I reviewed integrity JEPs once again along with this email thread and I think 
> there are several flaws in the proposal that need to be addressed before 
> implementation.
>  
>       • First, the JEP equates an agent with an instrumenting 
> agent, which is not true. Instrumentation is just one of the capabilities 
> that an agent needs to request explicitly by calling the JVM TI AddCapabilities 
> function. There are many other read-only features of JVM TI that 
> observability and troubleshooting agents can use without compromising 
> application integrity. Disabling all agents by default just to protect against 
> the few that modify application code is like cracking a nut with a 
> sledgehammer, especially when a more fine-grained approach is already built 
> into JVM TI.
> 
>       • JEP states that most serviceability tools do not require dynamic 
> agents. This sounds weird to me. How was that "most" measured? How can half a 
> dozen JDK builtin tools be compared to an infinite number of custom tools 
> that may be, and already have been, developed using JVM TI?
> 
>       • JEP assumes that existing JDK tools are enough for troubleshooting. I 
> wish they were. How, for example, would you dump an object graph without 
> sensitive user data from a live service? With a JVM TI agent, this is possible. 
> Which builtin tool allows you to find native memory leaks, find sources of long 
> time-to-safepoint pauses, or map perf counters to Java code? Unfortunately, 
> none. Even worse, when dynamic agents are disabled, development of new custom 
> tools will become meaningless.
> 
>       • You emphasized many times that the proposal to disable dynamic agents 
> appeared years ago. And that's actually the problem with this JEP. It relies 
> on outdated assumptions and has not been adjusted to the modern trends. 
> Technology didn't stay still; new use cases became popular, which this 
> proposal does not take into account. Here are some examples:
>               • Containers became the standard way to ship and deploy 
> applications (btw, a good thing integrity-wise). A container image usually has 
> the minimum amount of software required to run the app: no additional tools, 
> restricted environment. Now consider that I want to monitor the application. 
> Even if I'm allowed to modify the command line, I can't simply add 
> -agentpath, since the agent library is not available in the container. A 
> typical pattern for using serviceability tools with containerized 
> applications is to run a sidecar container that has all required tools and 
> capabilities. How would you suggest attaching a tool to a running container?
>               • In the last couple of years, with the growing popularity of 
> continuous profilers, a number of solutions appeared for system-wide or 
> infrastructure-wide zero-configuration monitoring. The idea is that you 
> install the observability software, and it automatically discovers all 
> supported processes and starts monitoring/profiling them, regardless of how 
> they were deployed. gProfiler, Parca, Pyroscope, just to name a few examples. 
> The keyword here is "zero-configuration". Observability by default is 
> just as important nowadays as integrity by default.
> 
>       • JEP outlines JFR as a universal solution for profiling, claiming it 
> is "far more efficient than anything" in collecting stack traces. This is not 
> true. Async-profiler (6K stars on GitHub, 700+ forks, more than a million 
> downloads) can collect 1000 execution samples per second per core without 
> significant overhead, thanks to hardware performance counters. Scalability of 
> JFR sampling mechanism is inherently poor: it uses just one dedicated thread 
> to walk through all Java threads in a loop and stop them one by one. JFR does 
> not show non-Java threads in a profile, it is blind to native frames, its 
> notion of thread states is misleading (e.g., Socket.read can spend CPU time 
> in the networking stack or just wait for incoming data, but JFR has no clue). 
> JFR fails to traverse valid Java stacks and silently discards such samples, 
> e.g., you will not see arraycopy in a profile, although it's a common 
> performance bottleneck. JFR is misleading not only in CPU profiling but also 
> in memory profiling, see JDK-8307488. It's utopian to think that JFR can 
> replace external profilers sometime soon - there is not even progress on 
> fixing smaller issues: open bugs hang for years (JDK-8252417, JDK-8153167, 
> JDK-8281677), some are closed as will-not-fix (JDK-8191415). Is it fair to 
> disallow valid usages of profilers at runtime without providing a viable 
> alternative?
> 
>       • You mentioned two goals: 1) prevent libraries from granting themselves 
> superpowers; 2) minimize the impact on serviceability tools that have to be 
> started by a human operator. However, what this JEP actually suggests is the 
> opposite: disabling dynamic loading of agents does not prevent libraries from 
> obtaining superpowers - they can simply call System.load(). At the same time, 
> disabling dynamic loading of agents has a huge impact on serviceability, up 
> to the complete inability to use external tools at runtime. I understand that 
> the plan is to disallow JNI someday too (unless explicitly allowed via a 
> command line option) for the purpose of integrity. Following your goals, it 
> would be more logical to disallow JNI first, as it is an easier way for 
> libraries to break integrity.
>  
> To summarize the above, the current proposal does not seem to me developed 
> enough to be targeted to JDK 21. I would suggest improving it by 1) 
> updating its assumptions; 2) taking the mentioned use cases into account; 3) 
> providing ready-to-use alternatives; 4) matching the plan with the goals.
>  
> Thank you,
> Andrei Pangin
> 
> On Fri, 19 May 2023 at 15:44, Ron Pressler <ron.press...@oracle.com> wrote:
> Because the discussion of this JEP has veered in many directions, let me 
> summarise where we are:
> 
> This JEP proposes to emit a suppressible warning when a JVM TI or Java agent 
> is loaded into a JVM sometime after startup through the Attach mechanism.
> 
> The warning helps make users aware that an agent has been injected into the 
> JVM and identify deployments that may need adjustment in advance of any 
> future changes to disallow agents from being dynamically loaded without the 
> application's consent. The warning will also let us better judge the impact 
> of such a future change.
> 
> — Ron
> 
> > On 8 May 2023, at 20:17, Mark Reinhold <mark.reinh...@oracle.com> wrote:
> > 
> > https://openjdk.org/jeps/451
> > 
> >  Summary: Issue warnings when agents are loaded dynamically into a
> >  running JVM. These warnings aim to prepare users for a future release
> >  which disallows the dynamic loading of agents by default in order to
> >  improve integrity by default. Serviceability tools that load agents at
> >  startup will not cause warnings to be issued in any release.
> > 
> > - Mark
>
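
For readers unfamiliar with the Attach mechanism that the warning discussed in this thread targets, the following is a minimal sketch of how a tool might load an agent into an already running JVM through the com.sun.tools.attach API. The class name, the helper that picks between a Java agent JAR and a native JVM TI library by file extension, and the command-line layout are assumptions made for the example, not anything prescribed by the JEP.

```java
import com.sun.tools.attach.VirtualMachine;
import java.util.Locale;

// Hypothetical tool: attaches to a running JVM and loads an agent into it.
final class AttachSketch {

    // Assumption for this sketch: a ".jar" suffix means a Java agent;
    // anything else is treated as a native JVM TI agent library path.
    static boolean isJavaAgentJar(String agent) {
        return agent.toLowerCase(Locale.ROOT).endsWith(".jar");
    }

    // Attaches to the JVM with the given process id and loads the agent.
    static void loadAgent(String pid, String agent, String options) throws Exception {
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            if (isJavaAgentJar(agent)) {
                vm.loadAgent(agent, options);      // Java agent (agentmain)
            } else {
                vm.loadAgentPath(agent, options);  // native JVM TI agent library
            }
        } finally {
            vm.detach();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: AttachSketch <pid> <agent-path> [options]");
            return;
        }
        loadAgent(args[0], args[1], args.length > 2 ? args[2] : null);
    }
}
```

Under the JEP as summarised in this thread, it is this kind of load into a running JVM that would produce the warning, whereas an agent specified at startup would not.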
