On Thu, 10 Dec 2020 13:15:41 GMT, Maurizio Cimadamore <[email protected]>
wrote:
> As a result of the recent integration of the foreign memory access API, some
> of the buffer implementations now use `ScopedMemoryAccess` instead of
> `Unsafe`. While this works generally well, there are situations where profile
> pollution arises, which result in a considerable slowdown. The profile
> pollution occurs because the same ScopedMemoryAccess method (e.g.
> `getIntUnaligned`) is called with two different buffer kinds (e.g. an off
> heap buffer where base == null, and an on-heap buffer where base == byte[]).
> Because of that, unsafe access cannot be optimized, since C2 can't guess what
> the unsafe base access is.
>
> In reality, this problem was already known (and solved) elsewhere: the
> sun.misc.Unsafe wrapper does basically the same thing that ScopedMemoryAccess
> does. To make sure that profile pollution does not occur in those cases,
> argument profiling is enabled for sun.misc.Unsafe as well. This patch adds
> yet another case for ScopedMemoryAccess.
>
> Here are the benchmark results:
>
> Before:
>
> Benchmark Mode Cnt Score Error
> Units
> LoopOverPollutedBuffer.direct_byte_buffer_get_float avgt 30 0.612 ? 0.005
> ms/op
> LoopOverPollutedBuffer.heap_byte_buffer_get_int avgt 30 2.740 ? 0.039
> ms/op
> LoopOverPollutedBuffer.unsafe_get_float avgt 30 0.504 ? 0.020
> ms/op
>
> After
>
> Benchmark Mode Cnt Score Error
> Units
> LoopOverPollutedBuffer.direct_byte_buffer_get_float avgt 30 0.613 ? 0.007
> ms/op
> LoopOverPollutedBuffer.heap_byte_buffer_get_int avgt 30 0.304 ? 0.002
> ms/op
> LoopOverPollutedBuffer.unsafe_get_float avgt 30 0.491 ? 0.004
> ms/op
src/hotspot/share/oops/methodData.cpp line 1593:
> 1591: ResourceMark rm;
> 1592: char* name = inv.name()->as_C_string();
> 1593: if (!strncmp(name, "get", 3) || !strncmp(name, "put", 3)) {
Pre-existing, but `!strncmp(name, "get", 3)` seem a very circumspect way of
writing `inv->name()->starts_with("get")` - which shouldn't need a
`ResourceMark` either. Another observation is that `inv.klass()` isn't inlined
(defined in bytecode.cpp), so introducing a local for `inv.klass()` avoids
multiple calls. How about this:
if (inv.is_invokevirtual()) {
Symbol* klass = inv.klass();
if (klass == vmSymbols::jdk_internal_misc_Unsafe() ||
klass == vmSymbols::sun_misc_Unsafe() ||
klass == vmSymbols::jdk_internal_misc_ScopedMemoryAccess()) {
Symbol* name = inv.name();
if (name->starts_with("get") || name->starts_with("put")) {
return true;
}
}
}
-------------
PR: https://git.openjdk.java.net/jdk/pull/1733