(Posting on the right thread and list now...)
On 6/9/20 2:26 AM, David Holmes wrote:
Hi Jiangli,
> http://cr.openjdk.java.net/~jiangli/8232222/webrev.03/
I'm having trouble keeping track of all the issues, so let me walk
through the changes as I see them:
- InstanceKlass::restore_unshareable_info
For boot loader classes, when no verification is enabled, we mark the
class as linked immediately. By doing this in restore_unshareable_info
there are no races (as the class is not exposed to anyone yet) and it
allows later checks for is_linked to be by-passed (under the
assumption that the class and its supertypes truly are in a state that
appears linked). However, this doesn't generate the JVM TI class
prepare event, and we can't do it here as that would introduce a
number of potential issues with JVM TI.
I see in the bug report some metrics from HelloWorld, but really this
needs to be backed up by a lot more performance measurements to
establish this is actually a worthwhile optimisation.
- SystemDictionary::define_instance_class
This is where we catch up with the JVM TI requirements and immediately
after posting the class load event we post the class prepare event.
As we have discussed, this earlier posting of the event is observable
to a JVMTI agent and although permitted by the specification it is a
change in behaviour that might impact existing agents.
Ioi has raised an issue about there being a race here with the
potential for the event being delivered multiple times. I agree this
code is not adequate:
1718 if (k->is_shared() && k->is_linked()) {
You only want to fire the event for exactly those classes that you
pre-linked, so at a minimum this has to be restricted to boot classes
only. Even then as Ioi points out once the class is exported to the
SystemDictionary and visibly seen to be loaded, then other threads may
race to link it and so have already posted the class prepare event. In
normal linking this race is avoided by the use of the init_lock to
check the linked state, do the linking and issue the class prepare
event, atomically. But your approach cannot do this as it stands, you
would need to add an additional flag to track whether the prepare
event had already be issued.
Thanks to Ioi and David for seeing this race. As I looked at the
change, it looked fairly simple and almost straightforward, but very
scary how these changes interact in such surprising ways. Without this
careful review, these changes cause endless work later on. The area of
class loading and our code for doing so has all sorts of subtle details
that are hard to reason about. I wish this weren't so and we can have
code that we're not afraid of.
The CSR is a nice writeup but I didn't see the race from the CSR either.
We need to take the opportunity to look at this from the top down in a
project like Leyden.
There are still some opportunities to speed up class loading in the
context of CDS and finding places that we can simplify, but this was
alarmingly not simple. I'm grateful to Ioi and David for doing this
work, and yours, for thorougly discussing this change.
Thanks,
Coleen
---
So the change as it stands is incomplete, and introduces a behavioural
change to JVM TI, and the benefits of it have not been clearly
established.
The JBS issue states this is a first step towards pre-initialization
and other optimisations, and it is certainly a pre-requisite to
pre-link before you can pre-initialize, but I don't think pulling out
pre-linking as a separate optimisation is really a worthwhile first
step. I have grave reservations about the ability to pre-initialize in
general and those issues have to be fleshed out in a project like
Leyden. Further, as Coleen points out this pre-linking optimisation is
incompatible with proposed vtable changes. Additionally, this seems it
will be incompatible with changes proposed in Valhalla, as additional
link-time actions will be needed that can't be done at the time of
restore_unshareable_info.
Bottom line for me is that I just don't think this change is worth
pursuing as a stand-alone optimisation at this time. Sorry.
Cheers,
David
-----
On 5/06/2020 8:14 am, Jiangli Zhou wrote:
Hi David,
On Wed, Jun 3, 2020 at 9:59 PM David Holmes <david.hol...@oracle.com>
wrote:
Ioi pointed out that my proposal was incomplete and that it would need
to be more like:
if (is_shared() &&
JvmtiExport::should_post_class_prepare() &&
!BytecodeVerificationLocal &&
loader_data->is_the_null_class_loader_data()) {
Handle h_init_lock(THREAD, init_lock());
ObjectLocker ol(h_init_lock, THREAD, h_init_lock() != NULL);
set_init_state(linked);
>>> call JVMTI
return true;
}
This alleviates any concerns about behavioural changes to JVM TI, and
also allows JVM TI enabled code to partially benefit from the
pre-linking optimisation.
Otherwise I agree with Ioi that any behaviour change to JVM TI needs to
be justified by significant performance gains.
Thanks a lot for the input and suggestion! Locking the init_lock for
the JVMTI ClassPrepre event here sounds ok to me. The ClassDefine is
normally posted before the ClassPrepare. That's why the change was
made in systemDictionary.cpp instead of within
InstanceKlass::restore_unshareable_info() function, to keep the same
events ordering for any given class. I added the 'init_lock' locking
code for post_class_prepare(), and kept the code in
systemDictionary.cpp in webreve.03 below. Not changing the JVMTI
events ordering feels safer to me. Would the following be ok to
everyone?
http://cr.openjdk.java.net/~jiangli/8232222/webrev.03/
I also changed the InstanceKlass::restore_unshareable_info() to set
_init_state via set_init_state API as you suggested. We can get away
without locking the init_lock for setting the flag itself.
Best regards,
Jiangli
David
-----
On 4/06/2020 8:42 am, David Holmes wrote:
Correction ...
On 3/06/2020 5:19 pm, David Holmes wrote:
On 3/06/2020 3:44 pm, Ioi Lam wrote:
On 6/2/20 10:16 PM, David Holmes wrote:
Hi Ioi,
On 3/06/2020 2:55 pm, Ioi Lam wrote:
On 5/27/20 11:13 PM, David Holmes wrote:
Hi Jiangli,
On 28/05/2020 11:35 am, Ioi Lam wrote:
I was going to take the suggestion, but realized that it
would add
unnecessary complications for archived boot classes with class
pre-initialization support. Some agents may set
JvmtiExport::should_post_class_prepare(). It's worthwhile to
support
class pre-init uniformly for archived boot classes with
JvmtiExport::should_post_class_prepare() enabled or disabled.
This would introduce behavioral changes when JVMTI is enabled:
+ The order of JvmtiExport::post_class_prepare is different than
before
+ JvmtiExport::post_class_prepare may be called for a class that
was not called before (if the class is never linked during
run time)
+ JvmtiExport::post_class_prepare was called inside the
init_lock, now it's called outside of the init_lock
I have to say I share Ioi's concerns here. This change will
impact
JVM TI agents in a way we can't be sure of. From a specification
perspective I think we are fine as linking can be lazy or eager,
so there's no implied order either. But this would be a
behavioural change that will be observable by agents. (I'm less
concerned about the init_lock situation as it seems potentially
buggy to me to call out to an agent with the init_lock held in
the
first place! I find it hard to imagine an agent only working
correctly if the init_lock is held.)
David,
The init_lock has a serializing effect. The callback for a
subclass
will not be executed until the callback for its super class has
been finished.
Sorry I don't see that is the case. The init_lock for the subclass
is distinct from the init_lock of the superclass, and linking of
subclasses and superclasses is independent.
In InstanceKlass::link_class_impl, you first link all of your super
classes.
If another thread is already linking your super class, you will
block
on that superclass's init_lock.
The point is that there is already a race in terms of the
execution of
the two callbacks. So while this change can certainly produce a
different result to what would previously be seen, such a result is
already possible in the general case.
Of course, I may be wrong and my analysis may be bogus. But I hope
you can appreciate that this is not going to be a trivial change to
analyze.
Yes I agree. While in general ordering of the class_prepare callbacks
is not guaranteed for independent classes, if a given application
explicitly loads and links classes in a known order then it can
(reasonably) expect its callbacks to execute in that order. If this
change means classes will now be linked in an order independent of
what the normal runtime order would be then that could be a problem
for existing agents.
So where does this leave us? The change is within spec, but could
trigger changes in agent behaviour that we can't really evaluate
a-priori. So as you say we should have a fairly good reason for doing
this. I can easily envisage that pre-linking when no callbacks are
enabled would be a performance boost. But with callbacks enabled and
consuming CPU cycles any benefit from pre-linking could be lost in
the
noise.
What if we did as Ioi suggested and only set the class as linked in
restore_unshareable_info if
!JvmtiExport::should_post_class_prepare();
and in addition in InstanceKlass::link_class_imp we added an
additional check at the start:
// Pre-linking at load time may have been disabled for shared
classes,
// but we may be able to do it now.
if (JvmtiExport::should_post_class_prepare() &&
!BytecodeVerificationLocal &&
loader_data->is_the_null_class_loader_data()) {
_init_state = linked;
}
There should obviously be a check for is_shared() in there as well.
David
-----
?
That avoids the problem of changing the JVM TI callback behaviour,
but
also shortens the link time path when the callbacks are enabled.
Hope I got that right. :)
David
-----
Thanks
- Ioi
David
-----
With the proposed patch, the callback for both the super class and
subclass can proceed in parallel. So if an agent performs class
hierarchy analysis, for example, it may need to perform extra
synchronization.
This is just one example that I can think of. I am sure there are
other issues that we have not thought about.
The fact is we are dealing with arbitrary code in the callbacks,
and we are changing the conditions of how they are called. The
calls happen inside very delicate code (class loading, system
dictionary). I am reluctant to do the due diligence, which is
substantial, of verifying that this is a safe change, unless we
have a really compelling reason to do so.
Thanks
- Ioi