Re: [External] : Re: JEP draft: Disallow the Dynamic Loading of Agents by Default

Dan Heidinga Mon, 01 May 2023 09:15:48 -0700

On Sun, Apr 30, 2023 at 10:19 AM Ron Pressler <ron.press...@oracle.com>
wrote:


> Hi Dan!
>
> > On 29 Apr 2023, at 03:30, Dan Heidinga <heidi...@redhat.com> wrote:
> >
> > Hi Ron,
> >
> > Thanks for writing up the JEP draft outlining the proposal to disallow
> dynamic loading of agents by default.  The Red Hat Java team has continued
> to discuss this proposal internally and with our stakeholders.
> >
> > While there is a general agreement (or at least acceptance) with the
> overall direction of this proposal, we still see the concerns raised by
> Andrew [0] as needing to be addressed.
> >
> > So let’s start with the high-order bit: timing.
> >
> > The JEP draft states it “intends to adopt the proposal [to disable
> agents by default] six years later, in 2023, for JDK 21.”  We would like to
> see this targeted to JDK 22 instead as the change has not been widely
> evangelized yet and comes as a surprise to many, both to those actively
> developing OpenJDK and to those monitoring its development.
> >
> > We owe it to the community to give this proposal enough bake time,
> especially given that JDK 21 is an LTS release.  Though the official
> position is that LTS releases are no different than any other release, the
> actions of JDK consumers make them special.  Users will be surprised by
> this change when they migrate from JDK 17 to 21.  If we delay till JDK 22,
> we give the ecosystem time to adapt to the change before the next LTS.
> >
> > Additionally, users who have tested with the
> -XX:-EnableDynamicAgentLoading option will have false expectations if they
> validated their use of jcmd to load the agent as the behaviour was not
> correct prior to JDK 21 [1].
> >
> > The next concern is one you call out in the draft that “Java's excellent
> serviceability has long been a source of pride for the platform.”  We
> agree!
> >
> > Java currently has an excellent, prime position in Observability
> capabilities. For better or for worse, there are many Observability tools
> that have relied on dynamic attach to provide the necessary monitoring for
> workloads.
> >
> > It’s important we give Java’s monitoring tools sufficient time to
> migrate comfortably without shocking the ecosystem by changing the default
> in an LTS.  By delaying the change till JDK 22, we give the ecosystem 2
> years to migrate and to prepare users for the change.
> >
> > Additionally, this provides the time for Java’s profiling tools to adapt
> as well.  And for the “ad-hoc troubleshooting” tools - like Byteman - to
> retrain their users.
>
> That’s a fair point. Even though the change was announced some years ago,
> some strong encapsulation features had a transition period where they
> emitted warnings before changing defaults. Since we can reasonably expect
> 21 to see relatively high adoption, we could take that opportunity to
> educate more users and only emit a warning when an agent is loaded
> dynamically (otherwise, since many users unfortunately skip versions, they
> would be equally surprised and unprepared at the next version they adopt as
> they would be if the default change were made in 21). Would you find that
> reasonable?
>

This "print a warning" approach makes a lot of sense - as you say, it
educates users of dynamic agents that action will be required while not
impeding the uptake of JDK 21.  It also follows the precedent set by the
--illegal-access option in JDK 9+.  Users who don't want to see the warning
in their logs can specify -XX:+EnableDynamicAgentLoading and are then well
prepared for JDK 22+.  Seems like a win-win approach.


>
> If so, we may perhaps be able to also emit warnings on JNI use in 21, thus
> bringing agents, JNI, and FFM to the same baseline in 21, i.e. they would
> all issue warnings unless sanctioned by the application.
>

I'm still working through the integrity JEP so I'll hold off on responding
regarding JNI for now.


>
> >
> > Finally, while it’s easy to agree with the principle that “the
> application owner is given the final say on whether to allow dynamic
> loading of agents”, the argument can (and should!) be made that those
> application owners have made that final decision by deploying the libraries
> and tools that use dynamic attach.  A JVM command line argument is not the
> only way to express their consent for code to run on their system.
> >
> > For many production uses, the reality is more complicated than a single
> “application owner”. Take a Kubernetes cluster for example.
> >
> > Who is the application owner when deploying to a Kubernetes cluster?
> The dev team that develops the application?  The operations team that
> manages the cluster?  The monitoring team that monitors the applications?
> The Support team that troubleshoots issues with the deployment?
> >
> > We should be careful not to understate or misrepresent the complex web
> of “owners” that are currently able (and, for business reasons, need) to
> apply agents dynamically.  Downplaying the complexity many organizations
> experience when dealing with changes to command line options (as an
> example) weakens the argument for changing today’s status quo.
>
> Right. In this case, “owner” means any person who has been given the
> sufficient OS privileges to attach a dynamic agent (and who then also has
> sufficient privileges to stop or start the process).
>
> Because the ideal is not to disrupt tools at all but rather to prevent
> libraries from escalating their powers without the application’s knowledge
> and consent, we’ve begun to explore means other than the flag to allow a
> tool to load an agent at runtime. Two ideas we’ve had so far are a
> challenge-response mechanism that would verify there’s a person in the loop
> or issuing certificates to tools that would be used by the VM to verify
> that it is an approved tool that’s loading an agent (revoking certificates
> that find their way to libraries). These mechanisms are, however, complex,
> so they (or perhaps some other alternative) may appear only later.
>

I'm still a little dubious of the distinction between tools and libraries
being drawn here.  In both cases, a responsible person has chosen to deploy
the library or the tool in their environment.  There's a human in the loop,
albeit at different stages, as one decision is made during development and
the other during deployment.  I understand the benefits to the runtime of
not allowing dynamic attach, as J9 operated in that model (or with limited
capabilities for dynamically attached agents) for many years.  But I also
saw frequent requests, from both vendors and users, to enable more dynamic
capabilities for such agents.  It was surprising how often users asked for
this even when attaching at launch would have resolved their issue; they
were unhappy with that solution despite the fact that it worked.


> >
> > Dynamically attached agents have been a “superpower” of Java and their
> use has greatly benefited the ecosystem as they’ve helped service and
> support use cases that otherwise aren’t possible, as they helped propel
> Java to the forefront of the Observability tooling, and allowed many other
> useful libraries to be developed.
> >
> > Let’s delay flipping the default until JDK 22 to give the breadth of the
> ecosystem time to absorb this change.
>
> Very well. If a warning is acceptable, we can do that in 21 and delay the
> default change to 22.
>

This gets a +1 from me.


>
> — Ron
>
> P.S.
>
> > We also know that in many cases customers and users may not be in a
> position to modify startup scripts (e.g. even to add in an extra parameter)
> as to do so may invalidate support contracts, etc.
>
> Could you expand more on that? Even if the default change happens in 22,
> it would not apply retroactively. Upgrading to a new JDK requires changing
> startup scripts, as does adding/upgrading libraries, which happens at least
> as frequently as upgrading a JDK version. How can a Java application be
> developed and deployed without the ability to change the command line? I
> can’t see how an application is expected to change its runtime version and
> yet not be able to change the command line? I mean, I can imagine setups
> where that could sometimes *happen* to work, but not a way for this to be
> *expected* to work. Certainly since the JRE was removed it’s been the
> assumption that upgrading a JDK version may require changing the command
> line.
>

Users believed - rightly or wrongly - that some applications restricted the
set of options that could be modified when deploying some Java-based
applications if they wanted support.  I don't have more specifics here but
this concern was raised more than once.  Setting the heap size was OK,
modifying other -XX options was considered not OK.

Historically, there has also been massive resistance to changes to command
lines due to the complexity in updating launchers, launch scripts, etc.
Applications that provide launcher scripts that must work across different
Java releases have struggled with detecting the correct version of Java to
set the options.  The difficulty with command line options is why things
like "-XX:+IgnoreUnrecognizedVMOptions" exist.  Because the
-XX:+EnableDynamicAgentLoading option has existed since JDK 9, only those
supporting both JDK 8 & a newer release should experience this kind of
issue.
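
To make the launcher problem concrete, here is a minimal sketch of the
version check a cross-release launcher might need.  The class and method
names (LauncherOptions, featureVersion, agentOptions) are hypothetical and
not taken from any real launcher; the only assumption carried over from
this thread is that the flag is recognized from JDK 9 onwards.

```java
import java.util.ArrayList;
import java.util.List;

public class LauncherOptions {

    // Extract the feature release from a java.version string.
    // Handles the legacy "1.8.0_352" form as well as "17.0.7" and "21-ea".
    static int featureVersion(String javaVersion) {
        String[] parts = javaVersion.split("[._+-]");
        int first = Integer.parseInt(parts[0]);
        // Before JDK 9, version strings started with "1." (1.8.x is JDK 8).
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    // Hypothetical helper: only pass the flag to JVMs assumed to recognize it
    // (JDK 9 and later, per the discussion above).
    static List<String> agentOptions(int feature) {
        List<String> opts = new ArrayList<>();
        if (feature >= 9) {
            opts.add("-XX:+EnableDynamicAgentLoading");
        }
        return opts;
    }

    public static void main(String[] args) {
        int feature = featureVersion(System.getProperty("java.version"));
        System.out.println(agentOptions(feature));
    }
}
```

A launcher that cannot run a check like this before starting the JVM is
left with "-XX:+IgnoreUnrecognizedVMOptions", which is the blunt
alternative mentioned above.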

--Dan


>
> There are more important mechanisms than loading agents dynamically that
> require setting VM options, such as selecting a GC/heap configuration
> tailored to the application’s particular needs. Even in third-party hosting
> situations, the application needs some level of control over the command
> line and the host will appreciate more control that allows it to select
> what capabilities it offers hosted applications.
>
>
>
