Hi David,

On 2019-07-23 09:34, David Holmes wrote:
Hi Erik,

On 23/07/2019 5:10 pm, Erik Österlund wrote:
Hi Martin,

1) In the x86_64 assembly, you can combine the movl; testl; into a single test instruction with one memory operand to the counter, and one immediate zero.

2) If libjvm.so maps in far away, then the movl taking an ExternalAddress, will actually scratch rscratch1, which is r10. That will clobber the rcounter, and will at best cause all loads to take the slow path. However, in the worst case, the subsequent verification might say that the load was okay, even though it was not.

I was secretly hoping to never have to touch fast JNI getfield again, because it is so shady, and the odd cases are very hard to test, making it so easy to mess up. The ForceUnreachable JVM flag might be useful in checking if a solution works also when rscratch1 gets clobbered when referencing JVM symbols that are now "far away".

The subtle issue of referencing JVM symbols that can be far away, suddenly clobbering r10, has bitten us many times. Perhaps it should be made more explicit somehow. But that's a separate issue.

Too subtle for me. Is this issue written up anywhere? How do we know what sequences are susceptible to this problem? How do we know when the problem actually occurs? What is the fix?

1) Is this documented: Nope, buried deep in the code, where you least expect to find it. A whole bunch of code asks if Assembler::reachable(AddressLiteral adr), and depending on the answer either perform a non-clobbering or a clobbering variation of the logical instruction. This is precisely what happens with mov32:

void MacroAssembler::mov32(AddressLiteral dst, Register src) {
  if (reachable(dst)) {
    movl(as_Address(dst), src);
  } else {
    lea(rscratch1, dst); <-------------------------------------- this is what will make things awkward
    movl(Address(rscratch1, 0), src);
  }
}

2) How do we know we are in trouble: Any macro assembler call that takes an ExternalAddress parameter, might have to clobber r10. Rarely, when the stars align, to make sure testing won't catch it. 3) When does it actually occur? When libjvm.so is mapped in further away from the code cache than reachable in a signed integer, e.g. ~2 GB apart in virtual address space. Tends to happen more often on windows it seems. 4) The fix that has been sadly applied all over the VM is to deal with r10 being clobbered across such macro assembler instructions, either by moving it to a place where it may safely be clobbered, or by stashing away r10 across the macro assembler call and then restore it after. And every now and then we forget this implicit side effect and things blow up instead.



Also, I noticed that the counter that we are checking if it has changed, is a 32 bit signed integer. They can actually wrap around, which is undefined behaviour at best, and will make these tests fail in the worst case. When we don't want counters to overflow, we use 64 bit integers.

Are you referring to the field-access counter? Changing that from a 32-bit to 64-bit value seems somewhat out-of-scope for the current change, and may also have issues on 32-bit systems.

True. That might be okay the way it is.

Thanks,
/Erik

Thanks,
David
-----

Thanks,
/Erik

On 2019-07-22 17:39, Doerr, Martin wrote:
Hi David and Erik,

I've tried to add the capability "can_generate_field_access_events" during live phase and got "AddCapabilities failed with error 98" which is "JVMTI_ERROR_NOT_AVAILABLE". So hotspot does not support switching it on during live phase.

Hotspot initializes "can_generate_field_modification_events" during "init_onload_solo_capabilities". As the name tells, it is implemented as an "onload" capability.

So the VM works as expected with and without my change.

Can I add you as reviewers?
If yes, which parts did you review (x86, SPARC, shared code)?

Thanks and best regards,
Martin


-----Original Message-----
From: Doerr, Martin
Sent: Freitag, 19. Juli 2019 13:11
To: David Holmes <david.hol...@oracle.com>; hotspot-runtime-
d...@openjdk.java.net; serviceability-dev@openjdk.java.net; Erik Osterlund
<erik.osterl...@oracle.com>
Subject: RE: RFR(M): 8227680: FastJNIAccessors: Check for JVMTI field access
event requests at runtime

Hi David,

thanks for elaborating on the capability enablement.
With respect to "AddCapabilities", I've only found "Typically this function is used in the OnLoad function. Some virtual machines may allow a limited set
of capabilities to be added in the live phase." in the spec [1].
I don't know which ones are supposed to be part of this "limited set of
capabilities".
As you already explained, adding the capability for field access events in the
live phase does obviously not work for hotspot.
The interpreter has the same issue.

Best regards,
Martin


[1]
https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#AddCapa
bilities


-----Original Message-----
From: David Holmes <david.hol...@oracle.com>
Sent: Freitag, 19. Juli 2019 02:30
To: Doerr, Martin <martin.do...@sap.com>; hotspot-runtime-
d...@openjdk.java.net; serviceability-dev@openjdk.java.net; Erik
Osterlund
<erik.osterl...@oracle.com>
Subject: Re: RFR(M): 8227680: FastJNIAccessors: Check for JVMTI field
access
event requests at runtime

Hi Martin,

On 18/07/2019 8:01 pm, Doerr, Martin wrote:
Hi David and Erik,

thank you for looking at my proposal.

If you try to use fast field accessors when you have to post the field access event then how can you safely go off into a JVM TI event callback
??

We speculatively load the field and check afterwards if we can use this
loaded value.
It is safe to use it if there was no safepoint and no JVMTI event was
requested.
Otherwise, we simply discard the (possibly) loaded value and load it again
in the slow path where we do all the synchronization and event posting.

Thanks for clarifying for me. That is all fine then.

The dynamics of this still concern me, but those concerns are also
present in the existing code. Currently we don't use the quick accessors if JvmtiExport::can_post_field_access() is true during VM startup - this is a one-of initialization check that sets the use of fast accessors for
the lifetime of the JVM. But that is set between the early-start and
start VM events, before the live-phase. But AFAICS the capability for
can_post_field_access can be set or cleared dynamically during the live
phase, thus invalidating the original decision on whether to use fast
accessors or not. With your changes the state of can_post_field_access
is still captured during VM initialization so again the decision to
check for a field access watch is hard-wired for the lifetime of the VM.
But once installed that check allows for use of the fast-path if no
actual watches are set - which is the whole point of this enhancement. So the issue with both old and new code is that if the capability is not present at VM startup the VM will be configured to always use the fast path, even if the capability (and field access watches) are added later.

Thanks,
David

@Erik:
Thanks for your proposal to change the function pointers. I'll look into
that.

Best regards,
Martin


-----Original Message-----
From: David Holmes <david.hol...@oracle.com>
Sent: Donnerstag, 18. Juli 2019 06:39
To: Doerr, Martin <martin.do...@sap.com>; hotspot-runtime-
d...@openjdk.java.net; serviceability-dev@openjdk.java.net
Subject: Re: RFR(M): 8227680: FastJNIAccessors: Check for JVMTI field
access
event requests at runtime

Hi Martin,

I need to think about this some more. A critical property of the fast field accessors are that they are trivial and completely safe. They are
complicated by the need to check if a GC may have happened while we
directly read the field.

If you try to use fast field accessors when you have to post the field access event then how can you safely go off into a JVM TI event callback
??

Thanks,
David

On 16/07/2019 11:31 pm, Doerr, Martin wrote:
Hi,

the current implementation of FastJNIAccessors ignores the flag -
XX:+UseFastJNIAccessors when the JVMTI capability
"can_post_field_access" is enabled.
This is an unnecessary restriction which makes field accesses
(Get<Type>Field) from native code slower when a JVMTI agent is
attached
which enables this capability.
A better implementation would check at runtime if an agent actually
wants
to receive field access events.

Note that the bytecode interpreter already uses this better
implementation by checking if field access watch events were
requested
(JvmtiExport::_field_access_count != 0).

I have implemented such a runtime check on all platforms which
currently
support FastJNIAccessors.

My new jtreg test runtime/jni/FastGetField/FastGetField.java contains
a
micro benchmark:
test-


support/jtreg_test_hotspot_jtreg_runtime_jni_FastGetField/runtime/jni/Fa
stGetField/FastGetField.jtr
shows the duration of 10000 iterations with and without
UseFastJNIAccessors (JVMTI agent gets attached in both runs).
My Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz needed 4.7ms with
FastJNIAccessors and 11.2ms without it.

Webrev:



http://cr.openjdk.java.net/~mdoerr/8227680_FastJNIAccessors/webrev.00/

We have run the test on 64 bit x86 platforms, SPARC and aarch64.
(FastJNIAccessors are not yet available on PPC64 and s390. I'll
contribute
them later.)
My webrev contains 32 bit implementations for x86 and arm, but
completely untested. It'd be great if somebody could volunteer to
review
and test these platforms.

Please review.

Best regards,
Martin


Reply via email to