This patch expands the use of a hash table for secondary superclasses
to the interpreter, C1, and runtime. It also adds a C2 implementation
of hashed lookup in cases where the superclass isn't known at compile
time.

HotSpot shared runtime
----------------------

Building hashed secondary tables is now unconditional. It takes very
little time, and now that the shared runtime always has the tables, it
might as well take advantage of them. The shared code is easier to
follow now, I think.
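
For readers unfamiliar with the general idea, here is a minimal sketch of
a bitmap-indexed, packed hash table of the kind that popcount makes cheap
to probe. It is not the HotSpot code. `KlassId`, `home_slot`,
`PackedSupers`, `build_table` and `is_secondary_super` are hypothetical
names, the hash function is an arbitrary choice, and the collision policy
(plain linear probing over 64 slots) is an assumption made for
illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a class pointer; not a HotSpot type.
using KlassId = uint64_t;

constexpr unsigned kSlots = 64;  // one bit of the bitmap per slot

// Assumed hash: the top 6 bits of a multiplicative hash give a slot in [0, 63].
inline unsigned home_slot(KlassId k) {
  return static_cast<unsigned>((k * 0x9E3779B97F4A7C15ULL) >> 58);
}

// Portable bit count, so the sketch does not depend on compiler builtins.
inline unsigned count_bits(uint64_t x) {
  unsigned n = 0;
  while (x != 0) { x &= x - 1; ++n; }  // clear the lowest set bit each round
  return n;
}

struct PackedSupers {
  uint64_t bitmap = 0;           // bit i set means slot i is occupied
  std::vector<KlassId> entries;  // occupied slots only, in ascending slot order
};

// Place each key by linear probing in a 64-slot table, then pack the
// occupied slots into a dense array.
inline PackedSupers build_table(const std::vector<KlassId>& supers) {
  assert(supers.size() <= kSlots);
  KlassId sparse[kSlots] = {};
  PackedSupers t;
  for (KlassId k : supers) {
    unsigned slot = home_slot(k);
    while ((t.bitmap >> slot) & 1) slot = (slot + 1) % kSlots;
    sparse[slot] = k;
    t.bitmap |= uint64_t{1} << slot;
  }
  for (unsigned slot = 0; slot < kSlots; ++slot) {
    if ((t.bitmap >> slot) & 1) t.entries.push_back(sparse[slot]);
  }
  return t;
}

// Probe from the home slot. The number of set bits below a slot is that
// slot's index in the packed array.
inline bool is_secondary_super(const PackedSupers& t, KlassId k) {
  unsigned slot = home_slot(k);
  for (unsigned probes = 0; probes < kSlots; ++probes) {
    if (((t.bitmap >> slot) & 1) == 0) return false;  // empty slot: miss
    uint64_t below = t.bitmap & ((uint64_t{1} << slot) - 1);
    if (t.entries[count_bits(below)] == k) return true;
    slot = (slot + 1) % kSlots;  // collision: try the next slot
  }
  return false;  // table is full and the key is absent
}
```

In this sketch a miss usually costs one bit test, and a hit costs one
bit count plus one compare, which is why the speed of popcount matters.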

There might be a performance issue on x86-64, in that we build
HotSpot for a default x86-64 target that does not support popcount.
This means that the HotSpot C++ runtime on x86 always uses a software
emulation of popcount, even though the vast majority of machines made
in the past 20 years can do popcount in a single instruction. It
wouldn't be terribly hard to do something about that.

Having said that, the software popcount is really not bad.
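
To give a sense of what a software popcount costs, here is a sketch of
the well-known branch-free SWAR technique for 64-bit values. This is an
illustration only: I am not claiming it is the exact routine the HotSpot
runtime uses, and `popcount64_sw` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Branch-free 64-bit population count using the classic SWAR reduction.
inline unsigned popcount64_sw(uint64_t x) {
  // Each 2-bit field now holds the count of set bits in that field.
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  // Sum adjacent 2-bit counts into 4-bit fields.
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  // Sum adjacent 4-bit counts into bytes.
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  // The multiply accumulates all byte counts into the top byte.
  return static_cast<unsigned>((x * 0x0101010101010101ULL) >> 56);
}

// Small usage example on known bit patterns.
inline void popcount64_sw_selfcheck() {
  assert(popcount64_sw(0) == 0);
  assert(popcount64_sw(0xF0ULL) == 4);
  assert(popcount64_sw(~uint64_t{0}) == 64);
}
```

That is roughly a dozen simple ALU operations with no branches and no
memory accesses.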

x86
---

x86 is rather tricky, because we still support
`-XX:-UseSecondarySupersTable` and `-XX:+UseSecondarySupersCache`, as
well as 32- and 64-bit ports. There's some further complication in
that only `RCX` can be used as a shift count, so there's some register
shuffling to do. All of this makes the logic in macroAssembler_x86.cpp
rather gnarly, with multiple levels of conditionals at compile time
and runtime.

AArch64
-------

AArch64 is considerably more straightforward. We always have a
popcount instruction and (thankfully) no 32-bit code to worry about.

Generally
---------

I would dearly love simply to rip out the "old" secondary supers cache
support, but I've left it in just in case someone has a performance
regression.

The versions of `MacroAssembler::lookup_secondary_supers_table` that
work with variable superclasses don't take a fixed set of temp
registers, and neither do they call out to a slow path subroutine.
Instead, the slow path is expanded inline.

I don't think this is necessarily bad. Apart from the very rare cases
where C2 can't determine the superclass to search for at compile time,
this code is only used for generating stubs, and it seemed to me
ridiculous to have stubs calling other stubs.

I've followed the guidance from @iwanowww not to obsess too much about
the performance of C1-compiled secondary supers lookups, and to prefer
simplicity over absolute performance. Nonetheless, this is a
complicated patch that touches many areas.

-------------

Commit messages:
 - Cleanup tests
 - small
 - Small
 - Temp
 - Merge remote-tracking branch 'refs/remotes/origin/JDK-8331658-work' into JDK-8331658-work
 - Fix x86-32
 - Fix x86
 - Temp
 - Temp
 - Temp
 - ... and 16 more: https://git.openjdk.org/jdk/compare/747e1e47...7d7694cc

Changes: https://git.openjdk.org/jdk/pull/19989/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19989&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8331341
  Stats: 886 lines in 13 files changed: 755 ins; 69 del; 62 mod
  Patch: https://git.openjdk.org/jdk/pull/19989.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19989/head:pull/19989

PR: https://git.openjdk.org/jdk/pull/19989
