Re: [9] RFR (M): 8069591: Customize LambdaForms which are invoked using MH.invoke/invokeExact

2015-01-21 Thread John Rose
On Jan 21, 2015, at 9:31 AM, Remi Forax wrote:
> 
> in Invokers.java, I think that checkCustomized should take an Object and not 
> a MethodHandle
> exactly like getCallSiteTarget takes an Object and not a CallSite.

The use of erased types (any ref => Object) in the MH runtime is an artifact of 
bootstrapping difficulties, early in the project.  I hope it is not necessary 
any more.  That said, I agree that the pattern should be consistent.

Vladimir, would you please file a tracking bug for this cleanup, to change MH 
library functions to use stronger types instead of Object?

> in MethodHandle.java, customizationCount is declared as a byte and there is 
> no check that
> the CUSTOMIZE_THRESHOLD is not greater than 127.

Yes.  Also, the maybeCustomize method has a race condition that could cause the 
counter to wrap.  It shouldn't use "+=1" to increment; it should load the old 
counter value, test it, increment it (in a local), and then store the updated 
value.  That is also one possible place to deal with jumbo CUSTOMIZE_THRESHOLD 
values.
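
A minimal sketch of the load/test/increment/store pattern described above. The class and method names are hypothetical and are not taken from the webrev; clamping the threshold to Byte.MAX_VALUE is just one assumed way to handle jumbo CUSTOMIZE_THRESHOLD values.

```java
// Hypothetical illustration only; not the code under review.
final class SaturatingCounterSketch {
    private final int limit;   // threshold clamped to what a byte can hold
    private byte count;        // intentionally non-atomic; a lost update only delays the trigger

    SaturatingCounterSketch(int threshold) {
        this.limit = Math.max(0, Math.min(threshold, Byte.MAX_VALUE));
    }

    /** Returns true once the counter has already reached the clamped threshold. */
    boolean tick() {
        byte old = count;              // load the old value exactly once
        if (old >= limit) {            // test it
            return true;
        }
        count = (byte) (old + 1);      // old < limit <= 127, so the stored value cannot wrap
        return false;
    }

    int count() {
        return count;
    }

    public static void main(String[] args) {
        SaturatingCounterSketch c = new SaturatingCounterSketch(100_000); // "jumbo" threshold
        for (int i = 0; i < 1_000; i++) {
            c.tick();
        }
        System.out.println(c.count()); // prints 127
    }
}
```

Because every thread stores a value derived from a count it has already tested, racing threads can lose increments but can never push the byte past the limit.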

-- John
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8069591: Customize LambdaForms which are invoked using MH.invoke/invokeExact

2015-01-21 Thread Remi Forax

Hi Vladimir,
in Invokers.java, I think that checkCustomized should take an Object and 
not a MethodHandle, exactly like getCallSiteTarget takes an Object and 
not a CallSite.

in MethodHandle.java, customizationCount is declared as a byte and there 
is no check that the CUSTOMIZE_THRESHOLD is not greater than 127.

cheers,
Rémi

On 01/21/2015 05:25 PM, Vladimir Ivanov wrote:

http://cr.openjdk.java.net/~vlivanov/8069591/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8069591

Overhead of non-inlined MH.invoke/invokeExact calls significantly 
increased with LambdaForm sharing. The cause is that the JIT compiler 
can't produce a single nmethod for the whole MethodHandle chain, so 
execution is spread across numerous nmethods (one per MethodHandle 
in the chain). The longer the chain, the larger the overhead.


The fix is to customize LambdaForms (create a dedicated LambdaForm for 
a MethodHandle). A per-MethodHandle counter is introduced, which is 
incremented every time a MethodHandle is invoked using 
MethodHandle.invoke/invokeExact. Once CUSTOMIZE_THRESHOLD is reached 
for a particular MethodHandle, its LambdaForm is replaced with a 
customized one, which has its MethodHandle embedded. This allows the JIT 
to see the actual MethodHandle during compilation and produce more 
efficient code.


This fix completely recovers Gbemu peak performance to pre-LambdaForm 
sharing level.


Testing: jck (api/java_lang/invoke), jdk/java/lang/invoke, nashorn 
tests, nashorn/octane


Thanks!

Best regards,
Vladimir Ivanov


[9] RFR (M): 8069591: Customize LambdaForms which are invoked using MH.invoke/invokeExact

2015-01-21 Thread Vladimir Ivanov

http://cr.openjdk.java.net/~vlivanov/8069591/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8069591

Overhead of non-inlined MH.invoke/invokeExact calls significantly 
increased with LambdaForm sharing. The cause is that the JIT compiler 
can't produce a single nmethod for the whole MethodHandle chain, so 
execution is spread across numerous nmethods (one per MethodHandle in 
the chain). The longer the chain, the larger the overhead.
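
For readers who want to see the call shape in question, here is a small sketch using only the public java.lang.invoke API. The class name, the addOne helper, and the chain length are made up for illustration; the point is that the handle lives in a non-final field, so the JIT cannot treat it as a compile-time constant at the invokeExact call site.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class; not part of the webrev.
final class NonConstantInvokeSketch {
    static int addOne(int x) {
        return x + 1;
    }

    // Non-final on purpose: the handle is not a constant for the compiler.
    static MethodHandle chain;

    /** Builds a chain of 'length' handles, each feeding its result to the next. */
    static MethodHandle buildChain(int length) {
        try {
            MethodHandle step = MethodHandles.lookup().findStatic(
                    NonConstantInvokeSketch.class, "addOne",
                    MethodType.methodType(int.class, int.class));
            MethodHandle mh = step;
            for (int i = 1; i < length; i++) {
                mh = MethodHandles.filterReturnValue(mh, step);
            }
            return mh;
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    /** Invokes the current chain through invokeExact with an (int)int signature. */
    static int call(int x) {
        try {
            return (int) chain.invokeExact(x);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        chain = buildChain(5);
        System.out.println(call(10)); // prints 15
    }
}
```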


The fix is to customize LambdaForms (create a dedicated LambdaForm for a 
MethodHandle). A per-MethodHandle counter is introduced, which is 
incremented every time a MethodHandle is invoked using 
MethodHandle.invoke/invokeExact. Once CUSTOMIZE_THRESHOLD is reached for 
a particular MethodHandle, its LambdaForm is replaced with a 
customized one, which has its MethodHandle embedded. This allows the JIT 
to see the actual MethodHandle during compilation and produce more 
efficient code.
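
The counting-then-customizing idea can be modelled in plain Java as follows. This is a toy model under stated assumptions, not the JDK implementation: Handle, Form, SHARED, and the threshold value of 3 are all hypothetical, and a "customized form" is represented simply as a lambda that captures its handle instead of receiving it as an argument.

```java
import java.util.function.IntUnaryOperator;

// Toy model only; none of these types exist in java.lang.invoke.
final class CustomizationSketch {
    static final int CUSTOMIZE_THRESHOLD = 3; // hypothetical value

    /** A "form" is the code that runs a handle; the shared one gets the handle as an argument. */
    interface Form {
        int run(Handle self, int arg);
    }

    /** One shared form serves every handle, so it cannot know which handle it runs. */
    static final Form SHARED = (self, arg) -> self.target.applyAsInt(arg);

    static final class Handle {
        final IntUnaryOperator target;
        Form form = SHARED;
        boolean customized;
        private byte count;

        Handle(IntUnaryOperator target) {
            this.target = target;
        }

        int invoke(int arg) {
            maybeCustomize();
            return form.run(this, arg);
        }

        private void maybeCustomize() {
            if (customized) {
                return;
            }
            byte old = count;
            if (old >= CUSTOMIZE_THRESHOLD) {
                final Handle bound = this;
                // The dedicated form has its handle embedded rather than passed in.
                form = (ignored, arg) -> bound.target.applyAsInt(arg);
                customized = true;
            } else {
                count = (byte) (old + 1);
            }
        }
    }

    public static void main(String[] args) {
        Handle h = new Handle(x -> x * 2);
        for (int i = 0; i < 5; i++) {
            System.out.println(h.invoke(21) + " customized=" + h.customized);
        }
    }
}
```

In this model the first three invocations only bump the counter, and the fourth swaps in the dedicated form; results are identical before and after the swap.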


This fix completely recovers Gbemu peak performance to pre-LambdaForm 
sharing level.


Testing: jck (api/java_lang/invoke), jdk/java/lang/invoke, nashorn 
tests, nashorn/octane


Thanks!

Best regards,
Vladimir Ivanov


Re: JFokus 2015 - the VM Tech Day

2015-01-21 Thread Marcus Lagergren
Btw,

I have a few 50% discounts left for the VM tech day. If you are interested, 
please e-mail me directly!

/Marcus

> On 19 Jan 2015, at 10:58, Marcus Lagergren wrote:
> 
> And to further clarify things - you can attend _only_ the VM Tech day / tech 
> summit, should you so desire, and skip the rest of the JFokus conference. 
> (What a strange thing to do, given the quality of JFokus, but I can’t be the 
> one questioning your priorities here)
> 
> (http://www.jfokus.se/jfokus/register.jsp)
> 
> /M
> 
>> On 18 Jan 2015, at 22:54, Marcus Lagergren wrote:
>> 
>> Greetings community members!
>> 
>> Here is something that I'm sure you'll find interesting.
>> 
>> I want to advertise the upcoming "VM tech day" event, scheduled to
>> take place February 2, 2015 at the JFokus conference in
>> Stockholm. Sorry for the short notice, but finalizing
>> the speaker list took us a bit more time than expected.
>> 
>> The VM tech day is a mini-track that runs the first day of the JFokus
>> conference. This is its schedule: 
>> https://www.jfokus.se/jfokus/jvmtech.jsp 
>> 
>> 
>> After some rather challenging months of jigsaw puzzles, it is with
>> great pleasure that I can announce that our speaker line up is now
>> complete - and it is great indeed! We are talking 100% gurus,
>> prophets, ninjas, rock stars, and all other similar terms that
>> normally get your resume binned if it passes my desk. But in this
>> case the labels are true. We have strictly top names from both the
>> commercial world and from academia ready to take you on a great
>> ride.
>> 
>> So what is the VM tech day? For those of you familiar with the JVM
>> Language Summit (JVMLS) that usually takes place in Santa Clara in
>> the summers, the format is similar. It’s the usual deal: anyone
>> morbidly interested in runtime internals, code generation, polyglot
>> programming and the complexities of language implementation, should
>> find a veritable gold mine of stimulating conversation and knowledge
>> transfer here. What is different from a typical JVMLS (except for the
>> shorter duration), is that we have widened the scope a bit to include
>> several runtimes, language implementation issues and polyglot
>> problems.
>> 
>> There will be six scheduled sessions and plenty of time for breakouts
>> and discussions. We will also heavily encourage audience interaction
>> and participation.
>> 
>> The JFokus VM tech day is opened by John Rose. I am sure John needs 
>> no introduction to the subscribers of this list. With advanced OpenJDK
>> projects like Valhalla and Panama booting up, John will discuss what
>> the JVM has in store for the future. 
>> 
>> Other speakers include the tireless Charlie Nutter from Red Hat, the
>> formidable Remi Forax, the brilliant Vyacheslav Egorov of Google v8
>> fame, the esteemed Dan Heidinga from IBM and the good looking Attila
>> Szegedi from Oracle.
>> 
>> We also have plenty of non-speaking celebrity participants in the
>> audience, for example Fredrik Öhrström: invokedynamic specification 
>> wizard extraordinaire and architect behind the new OpenJDK build
>> system. Stop by and get autographs ;)
>> 
>> Thusly: if you are attending JFokus, or if you are making up your mind
>> about attending it right now, the VM tech summit is definitely
>> something anyone subscribing to mlvm-dev wouldn't want to miss. The
>> cross-platform/cross-technology/cross-company focus that we have tried
>> very hard to create will without a doubt be ultra stimulating. Of that
>> you can be sure.
>> 
>> Please help us spread the word in whatever forums you deem
>> appropriate! Talk to your friends! Tweet links to this post! Yell from
>> your cubicle soap boxes across the neverending seas of fluorescent
>> lights!
>> 
>> Any further questions you may have about the event, not answered by
>> the web pages, can be directed either to me (@lagergren) or Mattias 
>> Karlsson (@matkar) or as replies to this e-mail thread.
>> 
>> On behalf of JFokus / VM Tech Day 2015
>> Marcus Lagergren
>> Master of ceremonies (or something)
>> 


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-21 Thread Vladimir Ivanov
Duncan, sorry for that.
Updated webrev in place.

Best regards,
Vladimir Ivanov

On 1/21/15 1:39 PM, MacGregor, Duncan (GE Energy Management) wrote:
> This version seems to have inconsistent removal of ignore profile in the
> hotspot patch. It’s no longer added to vmSymbols but is still referenced
> in classFileParser.
> 
> On 19/01/2015 20:21, "MacGregor, Duncan (GE Energy Management)" wrote:
> 
>> Okay, I've done some tests of this with the micro benchmarks for our
>> language & runtime which show pretty much no change except for one test
>> which is now almost 3x slower. It uses nested loops to iterate over an
>> array and concatenate the string-like objects it contains, and replaces
>> elements with these new longer string-like objects. It's a bit of a
>> pathological case, and I haven't seen the same sort of degradation in the
>> other benchmarks or in real applications, but I haven't done serious
>> benchmarking of them with this change.
>>
>> I shall see if the test case can be reduced down to anything simpler while
>> still showing the same performance behaviour, and try to add some compilation
>> logging options to narrow down what's going on.
>>
>> Duncan.
>>
>> On 16/01/2015 17:16, "Vladimir Ivanov" 
>> wrote:
>>
>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
>>> https://bugs.openjdk.java.net/browse/JDK-8063137
>>>
>>> After GuardWithTest (GWT) LambdaForms became shared, profile pollution
>>> significantly distorted compilation decisions. It affected inlining and
>>> hindered some optimizations. It causes significant performance
>>> regressions for Nashorn (on Octane benchmarks).
>>>
>>> Inlining was fixed by 8059877 [1], but it didn't cover the case when a
>>> branch is never taken. It can cause missed optimization opportunity, and
>>> not just increase in code size. For example, non-pruned branch can break
>>> escape analysis.
>>>
>>> Currently, there are 2 problems:
>>>    - branch frequencies profile pollution
>>>    - deoptimization counts pollution
>>>
>>> Branch frequency pollution hides from JIT the fact that a branch is
>>> never taken. Since GWT LambdaForms (and hence their bytecode) are
>>> heavily shared, but the behavior is specific to MethodHandle, there's no
>>> way for JIT to understand how particular GWT instance behaves.
>>>
>>> The solution I propose is to do profiling in Java code and feed it to
>>> JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
>>> profiling info is stored. Once JIT kicks in, it can retrieve these
>>> counts, if corresponding MethodHandle is a compile-time constant (and it
>>> is usually the case). To communicate the profile data from Java code to
>>> JIT, MethodHandleImpl::profileBranch() is used.
>>>
>>> If GWT MethodHandle isn't a compile-time constant, profiling should
>>> proceed. This happens when the corresponding LambdaForm is already
>>> shared: for newly created GWT MethodHandles, profiling can occur only in
>>> native code (dedicated nmethod for a single LambdaForm). So, when
>>> compilation of the whole MethodHandle chain is triggered, the profile
>>> should already be gathered.
>>>
>>> Overriding branch frequencies is not enough. Statistics on
>>> deoptimization events are also polluted. Even if a branch is never taken,
>>> the JIT emits an uncommon trap there only if the corresponding bytecode
>>> hasn't trapped too often and hasn't caused too many recompiles.
>>>
>>> I added @IgnoreProfile and placed it only on GWT LambdaForms. When JIT
>>> sees it on some method, Compile::too_many_traps &
>>> Compile::too_many_recompiles for that method always return false. It
>>> allows JIT to prune the branch based on custom profile and recompile the
>>> method, if the branch is visited.
>>>
>>> For now, I wanted to keep the fix very focused. The next thing I plan to
>>> do is to experiment with ignoring deoptimization counts for other
>>> LambdaForms which are heavily shared. I already saw problems caused by
>>> deoptimization counts pollution (see JDK-8068915 [2]).
>>>
>>> I plan to backport the fix into 8u40, once I finish extensive
>>> performance testing.
>>>
>>> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
>>> Octane).
>>>
>>> Thanks!
>>>
>>> PS: as a summary, my experiments show that fixes for 8063137 & 8068915
>>> [2] almost completely recover peak performance after LambdaForm sharing
>>> [3]. There's one more problem left (non-inlined MethodHandle invocations
>>> are more expensive when LFs are shared), but it's a story for another
>>> day.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8059877
>>>  8059877: GWT branch frequencies pollution due to LF sharing
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8068915
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8046703
>>>  JEP 210: LambdaForm Reduction and Caching
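
The per-handle branch counting described in the quoted proposal above can be approximated with the public java.lang.invoke API. This is a sketch under assumptions, not the patch itself: the class name, the isSmall/fast/slow helpers, and the profile hook are hypothetical, and the counts here are merely recorded in Java rather than fed to the JIT.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo; the real change wires its counts into the JIT, which this does not.
final class GwtProfileSketch {
    static boolean isSmall(int x) { return x < 100; }
    static int fast(int x) { return x + 1; }
    static int slow(int x) { return -x; }

    /** Records the test outcome: counts[1] for true, counts[0] for false (saturating). */
    static boolean profile(boolean result, int[] counts) {
        int idx = result ? 1 : 0;
        if (counts[idx] < Integer.MAX_VALUE) {
            counts[idx]++;
        }
        return result;
    }

    /** Builds a guardWithTest handle whose test outcome is counted in this handle's own int[2]. */
    static MethodHandle build(int[] counts) {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodType intToInt = MethodType.methodType(int.class, int.class);
            MethodHandle test = l.findStatic(GwtProfileSketch.class, "isSmall",
                    MethodType.methodType(boolean.class, int.class));
            MethodHandle prof = l.findStatic(GwtProfileSketch.class, "profile",
                    MethodType.methodType(boolean.class, boolean.class, int[].class));
            // Bind the per-handle array, then run the hook on the test's result.
            MethodHandle bound = MethodHandles.insertArguments(prof, 1, (Object) counts);
            MethodHandle profiledTest = MethodHandles.filterReturnValue(test, bound);
            return MethodHandles.guardWithTest(profiledTest,
                    l.findStatic(GwtProfileSketch.class, "fast", intToInt),
                    l.findStatic(GwtProfileSketch.class, "slow", intToInt));
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    static int call(MethodHandle gwt, int x) {
        try {
            return (int) gwt.invokeExact(x);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        int[] counts = new int[2];
        MethodHandle gwt = build(counts);
        call(gwt, 1);
        call(gwt, 2);
        // counts[0] == 0 here: for this handle the fallback branch was never taken.
        System.out.println("false=" + counts[0] + " true=" + counts[1]);
    }
}
```

Two handles built this way run the same kind of guard, yet each keeps its own counts, which is the property the shared bytecode profile loses.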

Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-21 Thread MacGregor, Duncan (GE Energy Management)
This version seems to have inconsistent removal of ignore profile in the
hotspot patch. It’s no longer added to vmSymbols but is still referenced
in classFileParser.

On 19/01/2015 20:21, "MacGregor, Duncan (GE Energy Management)" wrote:

>Okay, I've done some tests of this with the micro benchmarks for our
>language & runtime which show pretty much no change except for one test
>which is now almost 3x slower. It uses nested loops to iterate over an
>array and concatenate the string-like objects it contains, and replaces
>elements with these new longer string-like objects. It's a bit of a
>pathological case, and I haven't seen the same sort of degradation in the
>other benchmarks or in real applications, but I haven't done serious
>benchmarking of them with this change.
>
>I shall see if the test case can be reduced down to anything simpler while
>still showing the same performance behaviour, and try to add some compilation
>logging options to narrow down what's going on.
>
>Duncan.
>
>On 16/01/2015 17:16, "Vladimir Ivanov" 
>wrote:
>
>>http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
>>http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
>>https://bugs.openjdk.java.net/browse/JDK-8063137
>>
>>After GuardWithTest (GWT) LambdaForms became shared, profile pollution
>>significantly distorted compilation decisions. It affected inlining and
>>hindered some optimizations. It causes significant performance
>>regressions for Nashorn (on Octane benchmarks).
>>
>>Inlining was fixed by 8059877 [1], but it didn't cover the case when a
>>branch is never taken. It can cause missed optimization opportunity, and
>>not just increase in code size. For example, non-pruned branch can break
>>escape analysis.
>>
>>Currently, there are 2 problems:
>>   - branch frequencies profile pollution
>>   - deoptimization counts pollution
>>
>>Branch frequency pollution hides from JIT the fact that a branch is
>>never taken. Since GWT LambdaForms (and hence their bytecode) are
>>heavily shared, but the behavior is specific to MethodHandle, there's no
>>way for JIT to understand how particular GWT instance behaves.
>>
>>The solution I propose is to do profiling in Java code and feed it to
>>JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
>>profiling info is stored. Once JIT kicks in, it can retrieve these
>>counts, if corresponding MethodHandle is a compile-time constant (and it
>>is usually the case). To communicate the profile data from Java code to
>>JIT, MethodHandleImpl::profileBranch() is used.
>>
>>If GWT MethodHandle isn't a compile-time constant, profiling should
>>proceed. This happens when the corresponding LambdaForm is already
>>shared: for newly created GWT MethodHandles, profiling can occur only in
>>native code (dedicated nmethod for a single LambdaForm). So, when
>>compilation of the whole MethodHandle chain is triggered, the profile
>>should already be gathered.
>>
>>Overriding branch frequencies is not enough. Statistics on
>>deoptimization events are also polluted. Even if a branch is never taken,
>>the JIT emits an uncommon trap there only if the corresponding bytecode
>>hasn't trapped too often and hasn't caused too many recompiles.
>>
>>I added @IgnoreProfile and placed it only on GWT LambdaForms. When JIT
>>sees it on some method, Compile::too_many_traps &
>>Compile::too_many_recompiles for that method always return false. It
>>allows JIT to prune the branch based on custom profile and recompile the
>>method, if the branch is visited.
>>
>>For now, I wanted to keep the fix very focused. The next thing I plan to
>>do is to experiment with ignoring deoptimization counts for other
>>LambdaForms which are heavily shared. I already saw problems caused by
>>deoptimization counts pollution (see JDK-8068915 [2]).
>>
>>I plan to backport the fix into 8u40, once I finish extensive
>>performance testing.
>>
>>Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
>>Octane).
>>
>>Thanks!
>>
>>PS: as a summary, my experiments show that fixes for 8063137 & 8068915
>>[2] almost completely recover peak performance after LambdaForm sharing
>>[3]. There's one more problem left (non-inlined MethodHandle invocations
>>are more expensive when LFs are shared), but it's a story for another
>>day.
>>
>>Best regards,
>>Vladimir Ivanov
>>
>>[1] https://bugs.openjdk.java.net/browse/JDK-8059877
>> 8059877: GWT branch frequencies pollution due to LF sharing
>>[2] https://bugs.openjdk.java.net/browse/JDK-8068915
>>[3] https://bugs.openjdk.java.net/browse/JDK-8046703
>> JEP 210: LambdaForm Reduction and Caching
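
The interaction between the custom branch profile and the deoptimization statistics in the quoted proposal above can be pictured with a toy decision function. This is a Java model of the reasoning only: the real checks (Compile::too_many_traps and Compile::too_many_recompiles) live in HotSpot's C++ code, and the class name, TRAP_LIMIT, and parameters below are assumptions made for illustration.

```java
// Toy model of the pruning decision; not HotSpot code.
final class PruneDecisionSketch {
    static final int TRAP_LIMIT = 2; // hypothetical limit on recorded traps

    /**
     * Decides whether a branch may be replaced by an uncommon trap.
     *
     * @param takenCount    how often this handle's own profile saw the branch taken
     * @param trapCount     deoptimization count recorded for the (shared) bytecode
     * @param ignoreProfile true when the method carries the @IgnoreProfile marker
     */
    static boolean shouldPrune(int takenCount, int trapCount, boolean ignoreProfile) {
        if (takenCount != 0) {
            return false; // the branch is live for this handle
        }
        // With the marker, polluted trap statistics are treated as "not too many".
        boolean tooManyTraps = !ignoreProfile && trapCount >= TRAP_LIMIT;
        return !tooManyTraps;
    }

    public static void main(String[] args) {
        // Never-taken branch, but traps caused by other handles sharing the bytecode:
        System.out.println(shouldPrune(0, 5, false)); // prints false
        System.out.println(shouldPrune(0, 5, true));  // prints true
    }
}
```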