Hi Martin,
Your change works OK on arm32 with a minor correction. See the attached
patch.
thanks,
Boris
On 16.07.2019 16:31, Doerr, Martin wrote:
Hi,
The current implementation of FastJNIAccessors ignores the flag -XX:+UseFastJNIAccessors
when the JVMTI capability "can_post_field_access" is enabled.
This is an unnecessary restriction which makes field accesses (Get<Type>Field)
from native code slower when a JVMTI agent is attached which enables this capability.
A better implementation would check at runtime if an agent actually wants to
receive field access events.
Note that the bytecode interpreter already uses this better implementation by
checking if field access watch events were requested
(JvmtiExport::_field_access_count != 0).
I have implemented such a runtime check on all platforms which currently
support FastJNIAccessors.
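
To illustrate the idea outside of HotSpot, here is a minimal, self-contained C++ sketch of such a runtime check. It is not the code from the webrev: the real implementation emits the check as generated assembly on each platform. All names here (field_access_count, get_int_field, get_int_field_slow, Obj) are hypothetical stand-ins; only the condition "count != 0" mirrors JvmtiExport::_field_access_count.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for JvmtiExport::_field_access_count:
// the number of field access watches currently set by agents.
static std::atomic<int> field_access_count{0};

// Counters that record which path was taken (illustration only).
static int fast_path_hits = 0;
static int slow_path_hits = 0;

// Hypothetical object with a single int field.
struct Obj { int32_t value; };

// Slow path: in a real VM this is where a field access event
// would be posted to the agent before the value is returned.
static int32_t get_int_field_slow(const Obj* obj) {
  ++slow_path_hits;
  return obj->value;
}

// Accessor that decides at runtime, on every call, whether the
// fast path is allowed, instead of disabling the fast path as
// soon as the capability is enabled.
static int32_t get_int_field(const Obj* obj) {
  assert(obj != nullptr);
  if (field_access_count.load(std::memory_order_acquire) != 0) {
    return get_int_field_slow(obj);  // a watch is set: take the slow path
  }
  ++fast_path_hits;                  // no watch set: fast path
  return obj->value;
}
```

With this structure an attached agent that never sets a field access watch only pays for one load and one compare per access.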
My new jtreg test runtime/jni/FastGetField/FastGetField.java contains a micro
benchmark:
test-support/jtreg_test_hotspot_jtreg_runtime_jni_FastGetField/runtime/jni/FastGetField/FastGetField.jtr
shows the duration of 10000 iterations with and without UseFastJNIAccessors
(JVMTI agent gets attached in both runs).
My Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz needed 4.7ms with FastJNIAccessors
and 11.2ms without it.
Webrev:
http://cr.openjdk.java.net/~mdoerr/8227680_FastJNIAccessors/webrev.00/
We have run the test on 64-bit x86 platforms, SPARC and aarch64.
(FastJNIAccessors are not yet available on PPC64 and s390. I'll contribute them
later.)
My webrev contains 32-bit implementations for x86 and arm, but they are completely
untested. It'd be great if somebody could volunteer to review and test these
platforms.
Please review.
Best regards,
Martin
--- a/src/hotspot/cpu/arm/jniFastGetField_arm.cpp 2019-07-26 13:29:34.569851539 +0300
+++ b/src/hotspot/cpu/arm/jniFastGetField_arm.cpp 2019-07-26 13:31:34.441884864 +0300
@@ -32,7 +32,7 @@
#define __ masm->
-#define BUFFER_SIZE 96
+#define BUFFER_SIZE 120
address JNI_FastGetField::generate_fast_get_int_field0(BasicType type) {
const char* name = NULL;
@@ -114,7 +114,7 @@
if (JvmtiExport::can_post_field_access()) {
// Using barrier to order wrt. JVMTI check and load of result.
- __ membar(Assembler::LoadLoad, Rtmp1);
+ __ membar(MacroAssembler::LoadLoad, Rtmp1);
// Check to see if a field access watch has been set before we
// take the fast path.
@@ -191,7 +191,7 @@
if (JvmtiExport::can_post_field_access()) {
// Order JVMTI check and load of result wrt. succeeding check.
- __ membar(Assembler::LoadLoad, Rtmp2);
+ __ membar(MacroAssembler::LoadLoad, Rtmp2);
__ ldr_s32(Rsafept_cnt2, Address(Rsafepoint_counter_addr));
} else {
// Address dependency restricts memory access ordering. It's cheaper than explicit LoadLoad barrier