Hi David,
On 2019-07-23 09:34, David Holmes wrote:
Hi Erik,
On 23/07/2019 5:10 pm, Erik Österlund wrote:
Hi Martin,
1) In the x86_64 assembly, you can combine the movl; testl; into a
single test instruction with one memory operand to the counter, and
one immediate zero.
2) If libjvm.so maps in far away, then the movl taking an
ExternalAddress, will actually scratch rscratch1, which is r10. That
will clobber the rcounter, and will at best cause all loads to take
the slow path. However, in the worst case, the subsequent
verification might say that the load was okay, even though it was not.
I was secretly hoping to never have to touch fast JNI getfield again,
because it is so shady, and the odd cases are very hard to test,
making it so easy to mess up. The ForceUnreachable JVM flag might be
useful in checking if a solution works also when rscratch1 gets
clobbered when referencing JVM symbols that are now "far away".
The subtle issue of referencing JVM symbols that can be far away,
suddenly clobbering r10, has bitten us many times. Perhaps it should
be made more explicit somehow. But that's a separate issue.
Too subtle for me. Is this issue written up anywhere? How do we know
what sequences are susceptible to this problem? How do we know when
the problem actually occurs? What is the fix?
1) Is this documented: Nope, buried deep in the code, where you least
expect to find it. A whole bunch of code asks if
Assembler::reachable(AddressLiteral adr), and depending on the answer
either perform a non-clobbering or a clobbering variation of the logical
instruction. This is precisely what happens with mov32:
void MacroAssembler::mov32(AddressLiteral dst, Register src) {
if (reachable(dst)) {
movl(as_Address(dst), src);
} else {
lea(rscratch1, dst); <-------------------------------------- this
is what will make things awkward
movl(Address(rscratch1, 0), src);
}
}
2) How do we know we are in trouble: Any macro assembler call that takes
an ExternalAddress parameter, might have to clobber r10. Rarely, when
the stars align, to make sure testing won't catch it.
3) When does it actually occur? When libjvm.so is mapped in further away
from the code cache than reachable in a signed integer, e.g. ~2 GB apart
in virtual address space. Tends to happen more often on windows it seems.
4) The fix that has been sadly applied all over the VM is to deal with
r10 being clobbered across such macro assembler instructions, either by
moving it to a place where it may safely be clobbered, or by stashing
away r10 across the macro assembler call and then restore it after. And
every now and then we forget this implicit side effect and things blow
up instead.
Also, I noticed that the counter that we are checking if it has
changed, is a 32 bit signed integer. They can actually wrap around,
which is undefined behaviour at best, and will make these tests fail
in the worst case. When we don't want counters to overflow, we use 64
bit integers.
Are you referring to the field-access counter? Changing that from a
32-bit to 64-bit value seems somewhat out-of-scope for the current
change, and may also have issues on 32-bit systems.
True. That might be okay the way it is.
Thanks,
/Erik
Thanks,
David
-----
Thanks,
/Erik
On 2019-07-22 17:39, Doerr, Martin wrote:
Hi David and Erik,
I've tried to add the capability "can_generate_field_access_events"
during live phase and got "AddCapabilities failed with error 98"
which is "JVMTI_ERROR_NOT_AVAILABLE". So hotspot does not support
switching it on during live phase.
Hotspot initializes "can_generate_field_modification_events" during
"init_onload_solo_capabilities". As the name tells, it is
implemented as an "onload" capability.
So the VM works as expected with and without my change.
Can I add you as reviewers?
If yes, which parts did you review (x86, SPARC, shared code)?
Thanks and best regards,
Martin
-----Original Message-----
From: Doerr, Martin
Sent: Freitag, 19. Juli 2019 13:11
To: David Holmes <david.hol...@oracle.com>; hotspot-runtime-
d...@openjdk.java.net; serviceability-dev@openjdk.java.net; Erik
Osterlund
<erik.osterl...@oracle.com>
Subject: RE: RFR(M): 8227680: FastJNIAccessors: Check for JVMTI
field access
event requests at runtime
Hi David,
thanks for elaborating on the capability enablement.
With respect to "AddCapabilities", I've only found "Typically this
function is
used in the OnLoad function. Some virtual machines may allow a
limited set
of capabilities to be added in the live phase." in the spec [1].
I don't know which ones are supposed to be part of this "limited
set of
capabilities".
As you already explained, adding the capability for field access
events in the
live phase does obviously not work for hotspot.
The interpreter has the same issue.
Best regards,
Martin
[1]
https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#AddCapa
bilities
-----Original Message-----
From: David Holmes <david.hol...@oracle.com>
Sent: Freitag, 19. Juli 2019 02:30
To: Doerr, Martin <martin.do...@sap.com>; hotspot-runtime-
d...@openjdk.java.net; serviceability-dev@openjdk.java.net; Erik
Osterlund
<erik.osterl...@oracle.com>
Subject: Re: RFR(M): 8227680: FastJNIAccessors: Check for JVMTI field
access
event requests at runtime
Hi Martin,
On 18/07/2019 8:01 pm, Doerr, Martin wrote:
Hi David and Erik,
thank you for looking at my proposal.
If you try to use fast field accessors when you have to post the
field
access event then how can you safely go off into a JVM TI event
callback
??
We speculatively load the field and check afterwards if we can
use this
loaded value.
It is safe to use it if there was no safepoint and no JVMTI event
was
requested.
Otherwise, we simply discard the (possibly) loaded value and load
it again
in the slow path where we do all the synchronization and event
posting.
Thanks for clarifying for me. That is all fine then.
The dynamics of this still concern me, but those concerns are also
present in the existing code. Currently we don't use the quick
accessors
if JvmtiExport::can_post_field_access() is true during VM startup
- this
is a one-of initialization check that sets the use of fast
accessors for
the lifetime of the JVM. But that is set between the early-start and
start VM events, before the live-phase. But AFAICS the capability for
can_post_field_access can be set or cleared dynamically during the
live
phase, thus invalidating the original decision on whether to use fast
accessors or not. With your changes the state of
can_post_field_access
is still captured during VM initialization so again the decision to
check for a field access watch is hard-wired for the lifetime of
the VM.
But once installed that check allows for use of the fast-path if no
actual watches are set - which is the whole point of this
enhancement.
So the issue with both old and new code is that if the capability
is not
present at VM startup the VM will be configured to always use the
fast
path, even if the capability (and field access watches) are added
later.
Thanks,
David
@Erik:
Thanks for your proposal to change the function pointers. I'll
look into
that.
Best regards,
Martin
-----Original Message-----
From: David Holmes <david.hol...@oracle.com>
Sent: Donnerstag, 18. Juli 2019 06:39
To: Doerr, Martin <martin.do...@sap.com>; hotspot-runtime-
d...@openjdk.java.net; serviceability-dev@openjdk.java.net
Subject: Re: RFR(M): 8227680: FastJNIAccessors: Check for JVMTI
field
access
event requests at runtime
Hi Martin,
I need to think about this some more. A critical property of the
fast
field accessors are that they are trivial and completely safe.
They are
complicated by the need to check if a GC may have happened while we
directly read the field.
If you try to use fast field accessors when you have to post the
field
access event then how can you safely go off into a JVM TI event
callback
??
Thanks,
David
On 16/07/2019 11:31 pm, Doerr, Martin wrote:
Hi,
the current implementation of FastJNIAccessors ignores the flag -
XX:+UseFastJNIAccessors when the JVMTI capability
"can_post_field_access" is enabled.
This is an unnecessary restriction which makes field accesses
(Get<Type>Field) from native code slower when a JVMTI agent is
attached
which enables this capability.
A better implementation would check at runtime if an agent
actually
wants
to receive field access events.
Note that the bytecode interpreter already uses this better
implementation by checking if field access watch events were
requested
(JvmtiExport::_field_access_count != 0).
I have implemented such a runtime check on all platforms which
currently
support FastJNIAccessors.
My new jtreg test runtime/jni/FastGetField/FastGetField.java
contains
a
micro benchmark:
test-
support/jtreg_test_hotspot_jtreg_runtime_jni_FastGetField/runtime/jni/Fa
stGetField/FastGetField.jtr
shows the duration of 10000 iterations with and without
UseFastJNIAccessors (JVMTI agent gets attached in both runs).
My Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz needed 4.7ms with
FastJNIAccessors and 11.2ms without it.
Webrev:
http://cr.openjdk.java.net/~mdoerr/8227680_FastJNIAccessors/webrev.00/
We have run the test on 64 bit x86 platforms, SPARC and aarch64.
(FastJNIAccessors are not yet available on PPC64 and s390. I'll
contribute
them later.)
My webrev contains 32 bit implementations for x86 and arm, but
completely untested. It'd be great if somebody could volunteer to
review
and test these platforms.
Please review.
Best regards,
Martin