Re: Call site (optimizations)

Gil Tene Tue, 30 Jan 2018 16:39:10 -0800

Since new optimization tricks may pop up at any time (and there are also 
other reasons like OSR vs. non-OSR), it is never safe to assume that only a 
single compiled version of a method exists in the code cache at a point in 
time...


While having multiple versions of compilations of the same code at the same 
time may make sense for all sorts of reasons, I don't think that per-thread 
code ever makes much sense. It's the calling context (and not the specific 
thread) that tends to allow and/or require different things to happen. E.g. 
an OSR-entry compiled version of a method may be active on 10 threads, 
while 50 other threads are running the normal method entry version at the 
same time. Or, in a hypothetical optimization, different versions of the 
same target method can be called directly (with no virtual dispatch) from 
different callers, in cases where the caller didn't want to inline for some 
reason, but can still project some knowledge into the callee in a way that 
helps optimize the callee. E.g. compiled version 4 of method M is compiled 
knowing that argument 3 is a null, and is called from the 7 callers where 
this is known to be true, but a different compiled version 5 of the same 
method M is called from all other call sites.
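
To make that hypothetical a bit more concrete, here is a source-level analogy (not something any particular JIT is known to emit). The class and method names are made up for illustration, and the "known null" variant is hand-written to show what a compiler could derive if it knew the third argument was always null at some call sites:

```java
public class SpecializedVersions {
    // General version: has to cope with both null and non-null labels.
    static int describe(int a, int b, String label) {
        int base = a + b;
        if (label != null) {
            base += label.length();
        }
        return base;
    }

    // Hypothetical specialized version for callers that always pass
    // label == null: the null check and its branch fold away entirely.
    static int describeLabelKnownNull(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        // Call sites that know label is null could be bound to the specialized version...
        System.out.println(describeLabelKnownNull(1, 2));
        // ...while every other call site keeps using the general one.
        System.out.println(describe(1, 2, "abc"));
    }
}
```

Both versions are the "same method" from the program's point of view; only the calling context decides which one is safe to use, and the thread making the call plays no role.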


On Tuesday, January 30, 2018 at 2:13:03 PM UTC, Francesco Nigro wrote:
>
> Gil really thank you for the super detailed answer!
>
> I'm digesting it and I've found an interesting talk with some of these 
> concepts referenced/mentioned: https://youtu.be/oH4_unx8eJQ?t=4503
>
> Douglas Hawkins:"...the compilation is shared by all threads...all 
> threads. One thread behave badly, all threads suffer."
> After your explanation I would say: "..almost all threads..."
> But considering that this is about deopt, it doesn't make sense to be that precise!
>
> My initial fairytale idea of what a JVM should be able to optimize was 
> "probably" too imaginative eg 2 threads T1,T2 calling the same static 
> method:
>
> static void foo(..){
>     if(cond){
>         uncommonCase();
>     }
> }
>
> T1 with cond = true while T2 with cond = false.
> My "fairytale JVM" was able to optimize each specific call of foo() for 
> the 2 threads with specific compiled (inlined if small enough) versions 
> that benefit each one of the real code paths executed (ie uncommonCase() 
> inlined/compiled just for T1).
> Obviously the fairytale optimizations weren't only specific per-thread but 
> per call-site too :P
>
> After your answer I'm starting to believe that Santa Claus doesn't exist 
> and it's not his fault :)
>
> Thanks,
> Franz
>
>
> Il giorno sabato 27 gennaio 2018 18:45:28 UTC+1, Gil Tene ha scritto:
>>
>> Some partial answers inline below, which can probably be summarized with 
>> "it's complicated...".
>>
>> On Friday, January 26, 2018 at 8:33:53 AM UTC-8, Francesco Nigro wrote:
>>
>>> Hi guys,
>>>
>>> lately I've been having some fun playing with JITWatch (many 
>>> thanks to Chris Newland!!!) and trying to understand a lil' bit more about 
>>> specific JVM optimizations, but suddenly I've found out that I was missing 
>>> one of the most basic definitions: call site.
>>>
>>> I've several questions around that:
>>>
>>>    1. Is there a formal definition of call site from the point of view 
>>>    of the JIT?
>>>
>> I don't know about "formal". But a call site is generally any location 
>> in the bytecode of one method that explicitly causes a call to another 
>> method. These include:
>>
>> classic bytecodes used for invocation:
>> - Virtual method invocation (invokevirtual and invokeinterface, both of 
>> which call a non-static method on an instance), which in Java tends to 
>> dynamically be the most common form.
>> - Static method invocations (invokestatic)
>> - Constructor/initializer invocation (invokespecial)
>> - Some other cool stuff (private instance method invocation with 
>> invokespecial, native calls, etc.)
>>
>> In addition, you have these "more interesting" things that can be viewed 
>> (and treated by the JIT) as call sites:
>> - MethodHandle.invoke*()
>> - reflection based invocation (Method.invoke(), Constructor.newInstance())
>> - invokedynamic (a can full of Pandora's worms goes here...)
>>
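
As a rough illustration of the call site kinds listed above, the sketch below puts one of each in a single method. The class and method names are invented for the example, and the bytecode named in each comment is what javac typically emits for that line; running `javap -c` on the compiled class is the way to confirm it for a given JDK.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class CallSiteKinds {
    static String greet(String name) { return "hi " + name; }

    public static List<String> run() {
        try {
            List<String> out = new ArrayList<>();        // invokespecial (constructor)
            out.add(greet("static"));                    // invokestatic, then invokeinterface (List.add)
            out.add("virtual".concat("!"));              // invokevirtual (String.concat)

            // MethodHandle-based call site
            MethodHandle mh = MethodHandles.lookup().findStatic(
                    CallSiteKinds.class, "greet",
                    MethodType.methodType(String.class, String.class));
            out.add((String) mh.invokeExact("handle"));

            // Reflection-based call site
            Method m = CallSiteKinds.class.getDeclaredMethod("greet", String.class);
            out.add((String) m.invoke(null, "reflection"));

            // Lambda creation is compiled to invokedynamic;
            // calling the resulting object is an ordinary invokeinterface.
            Supplier<String> s = () -> greet("indy");
            out.add(s.get());
            return out;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```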
>>
>>>    2. I know that there are optimizations specific per call site, but 
>>>    is there a list of them somewhere (that is not the OpenJDK source code)? 
>>>    
>> The sort of optimizations that might happen at a call site can evolve 
>> over time, and JVMs and JITs can keep adding newer optimizations. Some of 
>> the current common call site optimizations include:
>>
>> - simple inlining: the target method is known (e.g. it is a static 
>> method, a constructor, or a 
>> we-know-there-is-only-one-implementor-of-this-instance-method-for-this-type-and-all-of-its-subtypes) 
>> and can be unconditionally inlined at the call site.
>> - guarded inlining: the target method is assumed to be a specific method 
>> (which we go ahead and inline), but a check (e.g. the exact type of this 
>> animal is actually a dog) is required ahead of the inlined code because we 
>> can't prove the assumption is true.
>> - bi-morphic and tri-morphic variants of guarded inlining exist (where 
>> two or three different targets are inlined).
>> - Inline cache: A virtual invocation dispatch (which would need to follow 
>> the instance's class to locate a target method) is replaced with a guarded 
>> static invocation to a specific target on the assumption of a specific 
>> ("monomorphic") callee type. "bi-morphic" and "tri-morphic" variants of 
>> inline cache exist (where one of two or three static callees are called 
>> depending on a type check, rather than performing a full virtual dispatch)
>> ...
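
The guarded inlining case can be pictured at source level roughly as follows. This is only an analogy under stated assumptions: the Animal/Dog/Cat types and method names are invented, the guard is written by hand, and a real JIT would typically compare class pointers in machine code and might deoptimize on a failed guard instead of falling back to a virtual call.

```java
public class GuardedInlining {
    interface Animal { String sound(); }

    static final class Dog implements Animal {
        public String sound() { return "woof"; }
    }

    static final class Cat implements Animal {
        public String sound() { return "meow"; }
    }

    // What the programmer writes: a single virtual call site.
    static String speak(Animal a) {
        return a.sound();
    }

    // Roughly what a monomorphic guarded inline of that call site amounts to,
    // assuming profiling has only ever seen Dog here.
    static String speakGuarded(Animal a) {
        if (a.getClass() == Dog.class) {
            return "woof";       // inlined body of Dog.sound()
        }
        return a.sound();        // slow path: full dispatch (or deoptimization in a real JIT)
    }

    public static void main(String[] args) {
        System.out.println(speakGuarded(new Dog()));
        System.out.println(speakGuarded(new Cat()));
    }
}
```

A bi-morphic variant would simply add a second type check with a second inlined body before the fallback.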
>>  
>> But there are bigger and more subtle things that can be optimized at and 
>> around a call site, which may not be directly evident from the calling code 
>> itself. Even when a call site "stays", things like this can happen:
>>
>> - Analysis of all possible callees shows that no writes to some locations 
>> are possible and that no order-enforcing operations will occur, allowing 
>> the elimination of re-reading of those locations after the call. [this can 
>> e.g. let us hoist reads out of loops containing calls].
>> - Analysis of all possible callees shows that no reads of some locations 
>> are possible and that no order-enforcing operations will occur, allowing 
>> the delay/elimination of writes to those locations [this can e.g. allow us 
>> to sink writes so that they occur once after a loop with calls in it].
>> - Analysis of all possible callees shows that an object passed as an 
>> argument does not escape to the heap, allowing certain optimizations (e.g. 
>> eliminating locks on the object if it was allocated in the caller and never 
>> escaped since we now know it is thread-local)
>> - ... (many more to come)
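
The first item in that list (hoisting a read out of a loop that contains a call) can be sketched at source level like this. The names are hypothetical, and the second method is a hand-written picture of what the compiler may be allowed to do once it has established that the callee writes no fields and performs no order-enforcing operations (no volatile access, no synchronization):

```java
public class HoistedRead {
    static final class Holder {
        int scale = 3;   // plain (non-volatile) field
    }

    // Callee: writes nothing to the heap and has no ordering effects.
    static int twice(int i) {
        return i * 2;
    }

    // As written: h.scale is read on every iteration, around each call.
    static int sumAsWritten(Holder h, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += h.scale * twice(i);
        }
        return sum;
    }

    // What the compiler may effectively turn it into: the field is read
    // once, before the loop, because no possible callee can change it.
    static int sumHoisted(Holder h, int n) {
        int scale = h.scale;
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += scale * twice(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        System.out.println(sumAsWritten(h, 5));
        System.out.println(sumHoisted(h, 5));
    }
}
```

If the callee could write to the field, or contained a volatile access or a lock, the hoisted form would no longer be a valid transformation.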
>>
>>
>>>    3. I know that compiled code from the JVM is available in a Code 
>>>    Cache to allow different call-sites to use it: does that mean that the same 
>>>    compiled method is used in all those call-sites (provided that's the 
>>>    best version of it)?
>>>    
>>>
>> Not necessarily. 
>>
>>
>> - The same method may exist in the code cache in multiple places:
>>   - Each tier may retain compiles of the same method.
>>   - Methods compiled for OSR-entry (e.g. transition from a lower tier 
>> version of the same method in the middle of a long-running loop) are 
>> typically a separate code cache entry from the ones compiled for entry via 
>> normal invocation.
>>   - Each location where a method is inlined is technically a separate 
>> copy of that method in the cache.
>>   - ...
>>
>> - When an actual virtual invocation is made, that invocation will 
>> normally go to the currently installed version of the method in the code 
>> cache (if one exists). However, because the JVM works very hard to avoid 
>> actual virtual invocations (and tends to succeed in doing this 90%+ of the 
>> time) some invocation sites may call other versions of the same method.
>>
>> - Some optimizations and/or profiling techniques may end up creating two 
>> or more different and specialized compiled versions of the same 
>> method, choosing which one to call depending on things that are known to or 
>> decided by the caller.
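
One way to see the OSR vs. normal-entry distinction from the list above on HotSpot is a small program like the following, run with -XX:+PrintCompilation. The class name and loop counts are arbitrary choices for the sketch, and whether and when each compile shows up depends on the JVM version and its thresholds; in HotSpot's PrintCompilation output, OSR compiles are the ones flagged with a '%'.

```java
public class OsrDemo {
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // A single invocation with a very long loop: the method is entered
        // once, so the only way to reach compiled code is an OSR entry in
        // the middle of the loop.
        System.out.println(sumTo(100_000_000));

        // Many short invocations: these make the method hot by invocation
        // count, which leads to a compile for the normal method entry.
        long total = 0;
        for (int i = 0; i < 50_000; i++) {
            total += sumTo(100);
        }
        System.out.println(total);
    }
}
```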
>>  
>>
>>>
>>> I hope to not have asked too naive questions :)
>>>
>>> thanks
>>> Franz
>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
