On Wed, 18 Oct 2023 13:04:35 GMT, Emanuel Peter <epe...@openjdk.org> wrote:
> This is a feature requested by @RogerRiggs and @cl4es.
>
> **Idea**
>
> Merging multiple consecutive small stores (e.g. 8 byte stores) into larger
> stores (e.g. one long store) can lead to speedup.
> Recently, @cl4es and @RogerRiggs had to review a few PRs where people would
> try to get speedups by using Unsafe (e.g. `Unsafe.putLongUnaligned`), or
> ByteArrayLittleEndian (e.g. `ByteArrayLittleEndian.setLong`).
> They have asked if we can do such an optimization in C2, rather than in the
> Java library code, or even user code.
>
> This patch here supports a few simple use-cases, like these:
>
> Merge consecutive array stores, with constants. We can combine the separate
> constants into a larger constant:
> https://github.com/openjdk/jdk/blob/adca9e220822884d95d73c7f070adeee2632130d/test/hotspot/jtreg/compiler/c2/TestMergeStores.java#L383-L395
>
> Merge consecutive array stores, with a variable that was split (using
> shifts). We can essentially undo the splitting (i.e. shifting and
> truncation), and directly store the variable:
> https://github.com/openjdk/jdk/blob/adca9e220822884d95d73c7f070adeee2632130d/test/hotspot/jtreg/compiler/c2/TestMergeStores.java#L444-L456
>
> The idea is that this would allow the introduction of a very simple API,
> without any "heavy" dependencies (Unsafe or ByteArrayLittleEndian):
>
> https://github.com/openjdk/jdk/blob/adca9e220822884d95d73c7f070adeee2632130d/test/hotspot/jtreg/compiler/c2/TestMergeStores.java#L327-L338
> https://github.com/openjdk/jdk/blob/adca9e220822884d95d73c7f070adeee2632130d/test/hotspot/jtreg/compiler/c2/TestMergeStores.java#L467-L472
>
> **Details**
>
> This draft currently implements the optimization in an additional special
> IGVN phase:
> https://github.com/openjdk/jdk/blob/adca9e220822884d95d73c7f070adeee2632130d/src/hotspot/share/opto/compile.cpp#L2479-L2485
>
> We first collect all `StoreB|C|I`, and put them in the IGVN worklist (see
> `Compile::gather_nodes_for_merge_stores`).
> During IGVN, we call `StoreNode::Ideal_merge_stores` at the end (after all
> other optimizations) of `StoreNode::Ideal`.
> We essentially try to establish a chain of mergable stores:
>
> https://github.com/openjdk/jdk/blob/adca9e220822884d95d73c7f070adeee2632130d/src/hotspot/share/opto/memnode.cpp#L2802-L2806
>
> Mergable stores must have the same Opcode (implies they have the same element
> type and hence size).
> Further, mergable stores must have the same control (or be separated by only
> a RangeCheck).
> Further, they must either both store constants, or adjacent segments of a
> larger value...

I imagine it would be beneficial if we could also merge stores to fields and stores from loads, both of which are common in object construction.

Thanks.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16245#issuecomment-1778435561
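
For readers without the linked test file at hand, the two array-store shapes described in the quoted text might look roughly like the sketch below. This is only an illustration: the class and method names are hypothetical and are not taken from `TestMergeStores.java`, and the `ByteBuffer` variant is included purely as a reference to show what a single merged little-endian long store would write. Whether C2 actually merges these stores depends on the patch and the platform; the code only demonstrates that the byte-wise and single-store forms are semantically equivalent.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical names throughout; a sketch of the store patterns, not the PR's test code.
class MergeStoresSketch {

    // Pattern 1: consecutive array stores of constants.
    // In principle the eight constants can be combined into one long constant.
    static void storeConstants(byte[] a, int offset) {
        a[offset + 0] = (byte) 0x01;
        a[offset + 1] = (byte) 0x02;
        a[offset + 2] = (byte) 0x03;
        a[offset + 3] = (byte) 0x04;
        a[offset + 4] = (byte) 0x05;
        a[offset + 5] = (byte) 0x06;
        a[offset + 6] = (byte) 0x07;
        a[offset + 7] = (byte) 0x08;
    }

    // Pattern 2: consecutive array stores of a variable that was split using
    // shifts and truncation (little-endian order: lowest byte first).
    static void storeLongLE(byte[] a, int offset, long v) {
        a[offset + 0] = (byte) v;
        a[offset + 1] = (byte) (v >> 8);
        a[offset + 2] = (byte) (v >> 16);
        a[offset + 3] = (byte) (v >> 24);
        a[offset + 4] = (byte) (v >> 32);
        a[offset + 5] = (byte) (v >> 40);
        a[offset + 6] = (byte) (v >> 48);
        a[offset + 7] = (byte) (v >> 56);
    }

    // Reference: what one little-endian long store at the same offset writes.
    static void storeLongReference(byte[] a, int offset, long v) {
        ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN).putLong(offset, v);
    }

    public static void main(String[] args) {
        byte[] split = new byte[16];
        byte[] merged = new byte[16];
        storeLongLE(split, 3, 0x1122334455667788L);
        storeLongReference(merged, 3, 0x1122334455667788L);
        System.out.println(Arrays.equals(split, merged));
    }
}
```

Stores to fields and stores whose values come from loads, as suggested above, are not covered by this sketch.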