On Aug 8, 2011, at 12:35 AM, Vladimir Kozlov wrote:

> Christian,
> 
> You need to add big comment to the new code in templateTable_<arch>.cpp 
> explaining what it does and why.

Done.  I made the wording a little more general because Tom's effectively final 
work might use the same machinery.

> 
> Why on sparc you use ld_ptr() to load from cache but on X86 and X64 you use 
> movl() (only 32  bit)?

Good question.  I took the code from TemplateTable::resolve_cache_and_index 
without thinking about it and that one uses ld_ptr.

_indices in CosntantPoolCacheEntry is defined as intx:

  volatile intx     _indices;  // constant pool index & rewrite bytecodes

and bytecode 1 and 2 are in the upper 16-bit of the lower 32-bit word:

// bit number |31                0|
// bit length |-8--|-8--|---16----|
// --------------------------------
// _indices   [ b2 | b1 |  index  ]

Loading 32-bit on LE gives you the right bits but on BE it does not.  I think 
that's the reason for the "optimization" on x64.

> 
> Add assert(byte_no == -1, ) to default: case to make sure you got all cases 
> above it.

Done.

> 
> I am concern about using next short branch in new code in 
> templateTable_sparc.cpp:
> 
> cmp_and_br_short(..., L_patch_done);  // don't patch
> 
> There is __ stop() call which generates a lot of code so that label 
> L_patch_done could be far.

Yeah, I thought I give it a try if it works.  cmp_and_br_short should assert if 
the branch displacement is too far, right?

> 
> 
> Why you added new #include into ciEnv.cpp and nmethod.cpp, what code needs 
> it? Nothing else is changed in these files.

Both files use dependencies and I got linkage errors on Linux while working on 
the fix (because of inline methods).  It seems that the include is not required 
in ciEnv.cpp because ciEnv.hpp already includes it.  I missed that.  But 
nmethod.cpp needs it because nmethod.hpp only declares class Dependencies.

> 
> I don't like assignments in condition and implicit NULL checks. Can you 
> change check_dependency() to next?:
> 
>      klassOop check_dependency() {
>        klassOop result = check_klass_dependency(NULL);
>        if (result != NULL) return result;
>        return check_call_site_dependency(NULL);
>      }

Done.

> 
> In interpreterRuntime.cpp initialize marked:  int marked = 0;

OK.

> 
> Why you did not leave "volatile" call site inlining with guard? You did not 
> explain why virtual call is fine for it.

The spec of MutableCallSite says:

"For target values which will be frequently updated, consider using a volatile 
call site instead."

And VolatileCallSite says:

"A VolatileCallSite is a CallSite whose target acts like a volatile variable. 
An invokedynamic instruction linked to a VolatileCallSite sees updates to its 
call site target immediately, even if the update occurs in another thread. 
There may be a performance penalty for such tight coupling between threads.

Unlike MutableCallSite, there is no syncAll operation on volatile call sites, 
since every write to a volatile variable is implicitly synchronized with reader 
threads.

In other respects, a VolatileCallSite is interchangeable with MutableCallSite."

Since VolatileCallSite really should only be used when you know the target 
changes very often we don't do optimizations for this case.  Obviously this is 
just a guess how people will use VolatileCallSite but I think for now this is a 
safe bet.

Additionally I had to do two small changes because the build was broken on some 
configurations:

-  klassOop new_type = _changes.is_klass_change() ? 
_changes.as_klass_change()->new_type() : NULL;
+  klassOop new_type = _changes.is_klass_change() ? 
_changes.as_klass_change()->new_type() : (klassOop) NULL;

and

-      MutexLockerEx ccl(CodeCache_lock, thread);
+      MutexLockerEx ccl(CodeCache_lock, Mutex::_no_safepoint_check_flag);

I updated the webrev.

-- Christian

> 
> 
> Vladimir
> 
> On 8/5/11 6:32 AM, Christian Thalinger wrote:
>> http://cr.openjdk.java.net/~twisti/7071653
>> 
>> 7071653: JSR 292: call site change notification should be pushed not pulled
>> Reviewed-by:
>> 
>> Currently every speculatively inlined method handle call site has a
>> guard that compares the current target of the CallSite object to the
>> inlined one.  This per-invocation overhead can be removed if the
>> notification is changed from pulled to pushed (i.e. deoptimization).
>> 
>> I had to change the logic in TemplateTable::patch_bytecode to skip
>> bytecode quickening for putfield instructions when the put_code
>> written to the constant pool cache is zero.  This is required so that
>> every execution of a putfield to CallSite.target calls out to
>> InterpreterRuntime::resolve_get_put to do the deoptimization of
>> depending compiled methods.
>> 
>> I also had to change the dependency machinery to understand other
>> dependencies than class hierarchy ones.  DepChange got the super-type
>> of two new dependencies, KlassDepChange and CallSiteDepChange.
>> 
>> Tested with JRuby tests and benchmarks, hand-written testcases, JDK
>> tests and vm.mlvm tests.
>> 
>> Here is the speedup for the JRuby fib benchmark (first is JDK 7 b147,
>> second with 7071653).  Since the CallSite targets don't change during
>> the runtime of this benchmark we can see the performance benefit of
>> eliminating the guard:
>> 
>> $ jruby --server bench/bench_fib_recursive.rb 5 35
>>   0.883000   0.000000   0.883000 (  0.854000)
>>   0.715000   0.000000   0.715000 (  0.715000)
>>   0.712000   0.000000   0.712000 (  0.712000)
>>   0.713000   0.000000   0.713000 (  0.713000)
>>   0.713000   0.000000   0.713000 (  0.712000)
>> 
>> $ jruby --server bench/bench_fib_recursive.rb 5 35
>>   0.772000   0.000000   0.772000 (  0.742000)
>>   0.624000   0.000000   0.624000 (  0.624000)
>>   0.621000   0.000000   0.621000 (  0.621000)
>>   0.622000   0.000000   0.622000 (  0.622000)
>>   0.622000   0.000000   0.622000 (  0.621000)
>> 

_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

Reply via email to