Re: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2]

2024-04-02 Thread Vladimir Ivanov
On Mon, 1 Apr 2024 21:07:31 GMT, Vladimir Kozlov  wrote:

>> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE 
>> [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365)
>>  which was used for AOT [JEP 295](https://openjdk.org/jeps/295) 
>> implementation in JDK 9. The code was left in HotSpot assuming it will help 
>> in a future. But during work on Leyden we decided to not use it. In Leyden 
>> cached compiled code will be restored in CodeCache as normal nmethods: no 
>> need to change VM's runtime and GC code to process them.
>> 
>> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce 
>> header size in separate changes. In these changes I did simple fields 
>> reordering to keep small (1 byte) fields together.
>> 
>> I do not see (and not expected) performance difference with these changes.
>> 
>> Tested tier1-5, xcomp, stress. Running performance testing.
>> 
>> I need help with testing on platforms which Oracle does not support.
>
> Vladimir Kozlov has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Removed not_used state of nmethod

Nice cleanup! Overall, looks very good.

What about `CompiledMethod_lock`? There's no `CompiledMethod` anymore, but the 
lock name still refers to it.

-

Marked as reviewed by vlivanov (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18554#pullrequestreview-1975392018


Re: RFR: 8324241: Always record evol_method deps to avoid excessive method flushing [v3]

2024-01-23 Thread Vladimir Ivanov
On Tue, 23 Jan 2024 19:00:51 GMT, Volker Simonis  wrote:

>> Currently we don't record dependencies on redefined methods (i.e. 
>> `evol_method` dependencies) in JIT compiled methods if none of the 
>> `can_redefine_classes`, `can_retransform_classes` or 
>> `can_generate_breakpoint_events` JVMTI capabalities is set. This means that 
>> if a JVMTI agent which requests one of these capabilities is dynamically 
>> attached, all the methods which have been JIT compiled until that point, 
>> will be marked for deoptimization and flushed from the code cache. For 
>> large, warmed-up applications this mean deoptimization and instant 
>> recompilation of thousands if not then-thousands of methods, which can lead 
>> to dramatic performance/latency drop-downs for several minutes.
>> 
>> One could argue that dynamic agent attach is now deprecated anyway (see [JEP 
>> 451: Prepare to Disallow the Dynamic Loading of 
>> Agents](https://openjdk.org/jeps/451)) and this problem could be solved by 
>> making the recording of `evol_method` dependencies dependent on the new 
>> `-XX:+EnableDynamicAgentLoading` flag isntead of the concrete JVMTI 
>> capabilities (because the presence of the flag indicates that an agent will 
>> be loaded eventually).
>> 
>> But there a single, however important exception to this rule and that's JFR. 
>> JFR is advertised as low overhead profiler which can be enabled in 
>> production at any time. However, when JFR is started dynamically (e.g. 
>> through JCMD or JMX) it will silently load a HotSpot internl JVMTI agent 
>> which requests the `can_retransform_classes` and retransforms some classes. 
>> This will inevitably trigger the deoptimization of all compiled methods as 
>> described above.
>> 
>> I'd therefor like to propose to *always* and unconditionally record 
>> `evol_method` dependencies in JIT compiled code by exporting the relevant 
>> properties right at startup in `init_globals()`:
>> ```c++
>>  jint init_globals() {
>>management_init();
>>JvmtiExport::initialize_oop_storage();
>> +#if INCLUDE_JVMTI
>> +  JvmtiExport::set_can_hotswap_or_post_breakpoint(true);
>> +  JvmtiExport::set_all_dependencies_are_recorded(true);
>> +#endif
>> 
>> 
>> My measurements indicate that the overhead of doing so is minimal (around 1% 
>> increase of nmethod size) and justifies the benefit. E.g. a Spring Petclinic 
>> application started with `-Xbatch -XX:+UnlockDiagnosticVMOptions 
>> -XX:+LogCompilation` compiles about ~11500 methods (~9000 with C1 and ~2500 
>> with C2) resulting in an aggregated nmethod size of around ~40bm. 
>> Additionally recording `evol_method` dependencies only increases this size 
>> be about 400kb
>
> Volker Simonis has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Updated option description and assertion based on review feedback

I support keeping the logic under a flag. I have some concerns about 
unconditionally turning it on. 

I expect significantly higher footprint overhead when an application has plenty 
of tiny methods and deep inlining trees. And java.lang.invoke implementation 
pushes it even further (with arbitrarily deep MethodHandle trees and 
unconditional inlining through them), so heavy users of MethodHandle API should 
experience higher overheads when evol dependencies are recorded.

I suggest to make the flag experimental. Once JFR implementation is improved, 
it can be superseded by `-XX:+EnableDynamicAgentLoading` check.

-

PR Comment: https://git.openjdk.org/jdk/pull/17509#issuecomment-1906824455


Re: RFR: 8304303: implement VirtualThread class notifyJvmti methods as C2 intrinsics [v4]

2023-03-17 Thread Vladimir Ivanov
On Fri, 17 Mar 2023 10:31:46 GMT, Serguei Spitsyn  wrote:

>> This is needed for performance improvements in support of virtual threads.
>> The update includes the following:
>> 
>> 1. Refactored the `VirtualThread` native methods:
>> `notifyJvmtiMountBegin` and `notifyJvmtiMountEnd`  => 
>> `notifyJvmtiMount`
>> `notifyJvmtiUnmountBegin` and `notifyJvmtiUnmountEnd` => 
>> `notifyJvmtiUnmount`
>> 2. Still useful implementation of old native methods is moved from `jvm.cpp` 
>> to `jvmtiThreadState.cpp`:
>>  `JVM_VirtualThreadMountStart`   => `VTMS_mount_begin`
>>  `JVM_VirtualThreadMountEnd` => `VTMS_mount_end`
>>  `JVM_VirtualThreadUnmountStart`  = > `VTMS_unmount_begin`
>>  `JVM_VirtualThreadUnmountEnd`=> `VTMS_mount_end`
>> 3. Intrinsified the `VirtualThread` native methods: `notifyJvmtiMount`, 
>> `notifyJvmtiUnmount`, `notifyJvmtiHideFrames`.
>> 4. Removed the`VirtualThread` static boolean state variable 
>> `notifyJvmtiEvents` and its support in `javaClasses`.
>> 5. Added static boolean state variable `_VTMS_notify_jvmti_events` to the 
>> jvmtiVTMSTransitionDisabler class as a replacement of the `VirtualThread` 
>> `notifyJvmtiEvents` variable.
>> 
>> Implementing the same methods as C1 intrinsics can be needed in the future 
>> but is a low priority for now.  
>> 
>> Testing:
>>  - Ran mach5 tiers 1-6. No regressions were found.
>
> Serguei Spitsyn has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   minor tweaks in intrisics implementation

Overall, compiler changes look good. 

Any performance numbers to justify the intrinsification?

-

Marked as reviewed by vlivanov (Reviewer).

PR: https://git.openjdk.org/jdk/pull/13054