Roger,
You are right that the existing code does not let me avoid the stack
walk. I withdraw the proposed patch.
I think at least part of the problem is the comment on
VM.latestUserDefinedLoader(). That says
/*
* Returns the first user-defined class loader up the execution stack,
* or the platform class loader if only code from the platform or
* bootstrap class loader is on the stack.
*/
I thought that a "user-defined class loader" was a *class loader*
defined by the user. I now think VM.latestUserDefinedLoader returns
the class loader of the first user-defined *method* up the execution
stack, whether that class loader is user-defined or is, for example,
the application class loader. I do not consider the application class
loader to be user-defined, though obviously methods loaded by the
application class loader can be user-defined.
... peter
-----Original Message-----
From: Roger Riggs <[email protected]>
Organization: Java Products Group, Oracle
Date: Tuesday, June 16, 2020 at 7:14 AM
To: "Peter Kessler (Open Source)" <[email protected]>,
"[email protected]" <[email protected]>
Subject: Re: [PATCH] 8246633: Improve the performance of
ObjectInputStream.resolveClass(ObjectStreamClass)
Hi Peter,
I think you've hardwired an assumption about the contents of the stack
into VM.latestUserDefineLoader...
On 6/12/20 7:19 PM, Peter Kessler (Open Source) wrote:
> Roger,
>
> I did think about confining the changes to ObjectInputStream. Maybe I
> did not think hard enough about it.
>
> Keeping a cache of the result of a first stack walk might work for the
> recursive calls. One might still do a useless stack walk for each
> top-level readObject call. No concurrent class loading can invalidate
> the cache for any particular stack walk. But it seems fraught to
> reuse a cached value in the face of concurrent class loading, or in
> the presence of overloaded ObjectInputStream methods that might create
> new ClassLoaders on this stack. I'm sure I don't want to debug that.
>
> My experiments have shown that the average stack walk is about 20
> frames. I do not have the data on how many of those frames are
> recursive calls within ObjectInputStream methods and how many are in
> other code.
I'd be interested in a histogram of the depths and in which applications.
> Dragging in the StackWalker API would further complicate things, in my
> opinion. It might appear to keep the code out of platform native
> code, but we know that walking thread stacks involves native code.
>
> I settled on a system-wide one-bit cache: There have been no
> user-defined ClassLoaders constructed, so I can not find one on any
> particular stack walk. It may not be the best possible solution, but
> it addresses the issue of useless stack walks in the common case where
> there are no user-defined ClassLoaders.
>
> I agree that both ClassLoader and ObjectInputStream are complex. In
> fact, the proposed patch makes no changes to ObjectInputStream. There
> is one line added to each of the system-defined ClassLoaders to
> identify them as not user-defined ClassLoaders, one line in the base
> ClassLoader constructor, and the bulk of the change is around the
> guard in jdk.internal.misc.VM.latestUserDefinedLoader() to avoid the
> stack walk if possible.
The spec and implementation of OIS.resolveClass do not make any assumptions
about whether serialization is invoked from a class on the
PlatformLoader, AppClassLoader,
or another loader.
In the case of a 'system' thread or class calling OIS.readObject, the
change you are proposing
will force class loading in resolveClass to the AppClassLoader.
In VM.latestUserDefinedLoader:390 caught my eye because it forces the
AppClassLoader.
It assumes that scanning the stack would find and return the
AppClassLoader, not so in all cases.
Regards, Roger
>
> ... peter
>
> -----Original Message-----
> From: Roger Riggs <[email protected]>
> Organization: Java Products Group, Oracle
> Date: Friday, June 12, 2020 at 11:50 AM
> To: "Peter Kessler (Open Source)" <[email protected]>,
"[email protected]" <[email protected]>
> Subject: Re: [PATCH] 8246633: Improve the performance of
ObjectInputStream.resolveClass(ObjectStreamClass)
>
> Hi Peter,
>
> The hazard to avoid is the cross package coupling that makes both
> ClassLoader and
> ObjectInputStream more complex; both are more than sufficiently
complex
> already.
>
> Optimizing the implementations make sense if it benefits enough uses
> cases and
> does not make the code harder to maintain and evolve.
>
> Have you explored any alternatives that can be confined to OIS?
> For example, taking advantage the many calls to readObject within
> a single OIS that are nested. Possibly the first (depth = 0) find
user
> class loader
> can be re-used or at least to short circuit the search on subsequent
> nested calls?
> Or caching the classloader at each depth and only need to probe for
it
> if it has
> not already been recorded. The StackWalker API may be useful to
limit
> the depth
> of searches.
>
> Thanks, Roger
>
>
>
> On 6/11/20 2:52 PM, Peter Kessler OS wrote:
> > Hi Roger,
> >
> > Thanks for your comments.
> >
> > I am not knowingly changing the required behavior. All I am
proposing is
> > guarding the thread stack walk if it will be a waste of effort.
If an
> > application or framework has its own ClassLoaders, then the guard
will fail
> > and the thread stack will be walked.
> >
> > The stack walk is important, if there are user-defined
ClassLoaders. A
> > method that uses ObjectInputStream to read objects can not know
what
> > ClassLoader to use to resolve class references in the stream,
without the
> > stack walk. But if there never have been any user-defined
ClassLoaders
> > constructed, then the stack walk can never return a user-defined
> > ClassLoader.
> >
> > If the stream reader is part of some simple application, then it
may spend
> > a significant amount of its time uselessly walking thread stacks.
(If the
> > stream reader is part of some simple application its author may
not know to
> > override ObjectInputStream.resolveClass, or how to determine what
> > ClassLoader should load a class from the stream.) If someone were
using
> > such an application to choose platforms, the Java platform would
be at a
> > disadvantage.
> >
> > Consider a somewhat more complex case, the SPECjbb2015 benchmark,
which
> > claims to model a well-constructed business application on the
Java
> > platform. SPECjbb2015 runs multiple JVMs, exchanging information
via
> > ObjectInputStreams. Profiling shows that it spends more than 5%
of its time
> > walking thread stacks. Yet, in spite of being a complex
application,
> > written by experts in the Java platform, there are no user-defined
> > ClassLoaders. So all the time spent walking stacks is wasted
time, and is
> > one of the reasons the servers fail their service-level
agreements. We can
> > discuss whether SPECjbb2015 is representative of Java
applications. But it
> > is easily available and relatively simple to run. People use
benchmarks
> > like SPECjbb2015 to decide whether to use the Java platform for
particular
> > tasks, and it would be unfortunate if we did not make the Java
platform as
> > good as it could be.
> >
> > ... peter
> >
> > -----Original Message-----
> > From: core-libs-dev <[email protected]> on behalf
of Roger Riggs <[email protected]>
> > Organization: Java Products Group, Oracle
> > Date: Tuesday, June 9, 2020 at 6:43 AM
> > To: "[email protected]"
<[email protected]>
> > Subject: Re: [PATCH] 8246633: Improve the performance of
ObjectInputStream.resolveClass(ObjectStreamClass)
> >
> > Hi Peter,
> >
> > I'd like to understand the scope and impact of the problem
before
> > jumping to a solution.
> > For specific applications, overriding
ObjectInputStream.resolveClass to
> > handle the class
> > lookup can handle the case as expeditiously as is necessary.
> >
> > What application uses cases are impacted and how prevelant
are they?
> > Many, many frameworks and applications create classloaders
for their own
> > purposes and
> > take advantage of the required behavior.
> >
> > Thanks, Roger
> >
> >
> > On 6/4/20 7:08 PM, Peter Kessler OS wrote:
> > > ObjectInputStream.resolveClass(ObjectStreamClass) is specified to
find "the first class loader on the current thread's stack (starting from the currently
executing method) that is neither the platform class loader nor its ancestor." Finding that
class loader is done with a virtual frame walk in JVM_LatestUserDefinedLoader(JNIEnv *env). The
virtual frame walk is expensive. If an application does not define any ClassLoaders, then the
virtual frame walk will always find the system class loader.
> > >
> > > If an application does a significant amount of reading
from an ObjectInputStream, the cost of
ObjectInputStream.resolveClass(ObjectInputStreamClass) can be considerable. In a sample
application, that method (and the methods it calls) have been measured to consume 5+% of
the CPU cycles.
> > >
> > > The proposal is to note the construction of any
user-defined ClassLoader, and if none has been constructed, avoid the virtual frame walk
returning ClassLoader.getSystemClassLoader(). But if any user-defined ClassLoader has
been constructed, use the virtual frame walk to find the first user-defined ClassLoader
on the thread stack.
> > >
> > > User-defined ClassLoaders could be distinguished from
non-user-defined ClassLoaders in several ways. The ClassLoader class hierarchy could be
changed to make non-user-defined ClassLoaders inherit from a marker class. There is
already a jdk.internal.loader.BuiltinClassLoader but it is used for module (and
resource) loading. BuiltinClassLoader is not a super-class of
jdk.internal.reflect.DelegatingClassLoader, which is used to load reflection classes,
and reflection methods are also ignored by the virtual frame walk. ClassLoaders could
have a boolean that distinguished whether they are user-defined or not. That would
require a change to ClassLoader API. There are other ways to distinguish user-defined
ClassLoaders. The proposal is to keep the decoration of the ClassLoaders out of the
instances, so that it is only visible to the code that needs it.
> > >
> > > The proposal is that each non-user-defined ClassLoader
have a static block that registers the class as a non-user-defined ClassLoader. Then in
the base ClassLoader constructor check if the ClassLoader being constructed is a
user-defined ClassLoader and if so set a boolean that acts as the guard on the virtual
frame walk. If additional non-user-defined ClassLoaders are declared in the future
beyond the ones identified in this patch, they can be added to the registry.
> > >
> > > There are currently 4 non-user-defined ClassLoader
classes, so the registry is not difficult to maintain. Nor is the registry difficult to
search when ClassLoaders are constructed. The guard boolean that records whether a
user-defined ClassLoader has been constructed transitions from false to true and never
becomes false again. (An extension to transition the boolean back to false when a
user-defined ClassLoader becomes unreachable is beyond the scope of this proposal.)
> > >
> > > This proposal slows loading of non-user-defined
ClassLoaders by the time it takes to register them. It slows ClassLoader construction
by the time it takes to consult the registry and conditionally set the guard boolean.
At each invocation of ObjectInputStream.resolveClass(ObjectInputStreamClass), the guard
boolean is used to possibly avoid the virtual frame walk.
> > >
> > > Tested with `make run-test-tier1` on Linux (CentOS 7) on
both aarch64 and x86_64. I had one failure on each of the runs
> > >
> > > ACTION: main -- Failed. Execution failed: `main'
threw exception: java.lang.Error: TESTBUG: the library has not been found in
${Repo}/build/linux-aarch64-server-release/images/test/hotspot/jtreg/native
> > > REASON: User specified action: run main/native
GTestWrapper
> > >
> > > on both plaforms. But I get the same failure when I run
the tier1 tests on clones of OpenJDK-15+25 on both aarch64 and x86_64.
> > >
> > > Running a sample (representative?) application with the
Oracle Developer Studio analyzer shows a performance comparison of
> > >
> > > Stack Fragment sorted by metric: Attributed Total
CPU Time
> > > baseline.er experiment.er
> > > Attributed Attributed Name
> > > Total CPU Time Total CPU Time
> > > sec. sec.
> > > ================================== Callers
> > > 3459.210 29.931
java.io.ObjectInputStream.readOrdinaryObject(boolean)
> > > 1139.727 3.532
java.io.ObjectInputStream.readArray(boolean)
> > > 875.262 9.116
java.io.ObjectInputStream.readNonProxyDesc(boolean)
> > >
> > > ================================== Stack Fragment
> > >
java.io.ObjectInputStream.readClassDesc(boolean)
> > >
java.io.ObjectInputStream.readNonProxyDesc(boolean)
> > >
java.io.ObjectInputStream.resolveClass(java.io.ObjectStreamClass)
> > >
java.io.ObjectInputStream.latestUserDefinedLoader()
> > > 4.173 3.953
jdk.internal.misc.VM.latestUserDefinedLoader()
> > >
> > > ================================== Callees
> > > 5470.026 0.
jdk.internal.misc.VM.latestUserDefinedLoader0()
> > > 0. 38.627
java.lang.ClassLoader.getSystemClassLoader()
> > >
> > > The `hg export -g` below is with respect to OpenJDK-15+25.
> > >
> > > Thank you for your review comments. I will need a sponsor
to get this change into the repository.
> > >
> > > ... peter
> > >
> > > # HG changeset patch
> > > # User Peter Kessler <[email protected]>
> > > # Date 1591310467 25200
> > > # Thu Jun 04 15:41:07 2020 -0700
> > > # Node ID 9a39488182c1dfc5bc7bb41d562d7ef16ee657f6
> > > # Parent 90b266a84c06f1b3dc0ed8767856793e8c1c357e
> > > Improve the performance of ObjectInputStream.resolveClass
> > >
> > > diff --git
a/src/java.base/share/classes/java/lang/ClassLoader.java
b/src/java.base/share/classes/java/lang/ClassLoader.java
> > > ---
a/src/java.base/share/classes/java/lang/ClassLoader.java
> > > +++
b/src/java.base/share/classes/java/lang/ClassLoader.java
> > > @@ -385,6 +385,8 @@
> > > }
> > > this.package2certs = new ConcurrentHashMap<>();
> > > this.nameAndId = nameAndId(this);
> > > + /* Note the construction of a ClassLoader. */
> > > + VM.noteClassLoaderConstruction(this.getClass());
> > > }
> > >
> > > /**
> > > diff --git
a/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
b/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
> > > ---
a/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
> > > +++
b/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
> > > @@ -117,6 +117,11 @@
> > > protected Class<?> loadClassOrNull(String cn,
boolean resolve) {
> > > return JLA.findBootstrapClassOrNull(this,
cn);
> > > }
> > > +
> > > + static {
> > > + /* Register this ClassLoader as not a
user-defined ClassLoader. */
> > > +
VM.registerNotUserDefinedClassLoader(BootClassLoader.class);
> > > + }
> > > };
> > >
> > > /**
> > > @@ -127,6 +132,8 @@
> > > static {
> > > if (!ClassLoader.registerAsParallelCapable())
> > > throw new InternalError();
> > > + /* Register this ClassLoader as not a
user-defined ClassLoader. */
> > > +
VM.registerNotUserDefinedClassLoader(PlatformClassLoader.class);
> > > }
> > >
> > > PlatformClassLoader(BootClassLoader parent) {
> > > @@ -142,6 +149,8 @@
> > > static {
> > > if (!ClassLoader.registerAsParallelCapable())
> > > throw new InternalError();
> > > + /* Register this ClassLoader as not a
user-defined ClassLoader. */
> > > +
VM.registerNotUserDefinedClassLoader(AppClassLoader.class);
> > > }
> > >
> > > final URLClassPath ucp;
> > > diff --git
a/src/java.base/share/classes/jdk/internal/misc/VM.java
b/src/java.base/share/classes/jdk/internal/misc/VM.java
> > > --- a/src/java.base/share/classes/jdk/internal/misc/VM.java
> > > +++ b/src/java.base/share/classes/jdk/internal/misc/VM.java
> > > @@ -304,12 +304,44 @@
> > > private static final int
JVMTI_THREAD_STATE_WAITING_INDEFINITELY = 0x0010;
> > > private static final int
JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT = 0x0020;
> > >
> > > + /** A registry of not user-defined ClassLoaders. */
> > > + private static final List<Class<? extends
ClassLoader>> notUserDefinedClassLoaderRegistry =
> > > +
Collections.synchronizedList(new ArrayList<>());
> > > +
> > > + /** Register a ClassLoader class as being not a
user-defined ClassLoader. */
> > > + public static void
registerNotUserDefinedClassLoader(Class<? extends ClassLoader> classLoaderClass) {
> > > +
notUserDefinedClassLoaderRegistry.add(classLoaderClass);
> > > + }
> > > +
> > > + /** A flag for whether a user-defined ClassLoaders
has been constructed. */
> > > + private static volatile boolean
constructedUserDefinedClassLoaderFlag = false;
> > > +
> > > + /**
> > > + * Note a ClassLoader construction, and record if a
> > > + * user-defined ClassLoader has been constructed.
> > > + */
> > > + public static void noteClassLoaderConstruction(Class<?
extends ClassLoader> classLoaderClass) {
> > > + /* Check if the ClassLoader class not a
user-defined ClassLoader. */
> > > + if
(notUserDefinedClassLoaderRegistry.contains(classLoaderClass)) {
> > > + return;
> > > + }
> > > + constructedUserDefinedClassLoaderFlag = true;
> > > + return;
> > > + }
> > > +
> > > /*
> > > * Returns the first user-defined class loader up
the execution stack,
> > > * or the platform class loader if only code from
the platform or
> > > * bootstrap class loader is on the stack.
> > > */
> > > public static ClassLoader latestUserDefinedLoader() {
> > > + if (!constructedUserDefinedClassLoaderFlag) {
> > > + /*
> > > + * If no user-defined ClassLoader has been
constructed,
> > > + * then I will not find a user-defined
ClassLoader in the stack.
> > > + */
> > > + return ClassLoader.getSystemClassLoader();
> > > + }
> > > ClassLoader loader = latestUserDefinedLoader0();
> > > return loader != null ? loader :
ClassLoader.getPlatformClassLoader();
> > > }
> > > diff --git
a/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
b/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
> > > ---
a/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
> > > +++
b/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
> > > @@ -30,6 +30,7 @@
> > >
> > > import jdk.internal.access.JavaLangAccess;
> > > import jdk.internal.access.SharedSecrets;
> > > +import jdk.internal.misc.VM;
> > >
> > > /** Utility class which assists in calling defineClass()
by
> > > creating a new class loader which delegates to the
one needed in
> > > @@ -73,4 +74,9 @@
> > > DelegatingClassLoader(ClassLoader parent) {
> > > super(parent);
> > > }
> > > +
> > > + static {
> > > + /* Register this ClassLoader as not a
user-defined ClassLoader. */
> > > +
VM.registerNotUserDefinedClassLoader(DelegatingClassLoader.class);
> > > + }
> > > }
> > >
> >
> >
>
>