The current AArch64 implementation of ArraysSupport.vectorizedHashCode 
processes polynomial reductions in relatively small groups, which limits 
parallelism in the hash accumulation path for large arrays.

This change increases the polynomial batch size to 16 elements per group, backed 
by a larger precomputed powers-of-31 table. The updated implementation exposes 
more independent multiply operations and shortens the dependency chains in the 
main hashing loop.
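
As a rough illustration of the batching idea (a scalar Java model, not the AArch64 stub itself), the sketch below folds 16 elements into the hash per iteration using a precomputed powers-of-31 table. The class name `BatchedHashSketch`, the table layout, and the tail handling are assumptions made for this example; the real stub uses vector instructions and its own table layout.

```java
import java.util.Arrays;

// Hypothetical scalar model of a 16-element polynomial batch; names are illustrative only.
final class BatchedHashSketch {
    static final int BATCH = 16;
    // POW31[j] = 31^(BATCH - 1 - j), so the last element of a batch is multiplied by 1.
    static final int[] POW31 = new int[BATCH];
    // 31^BATCH, used to shift the running hash past one whole batch.
    static final int POW31_BATCH;

    static {
        int p = 1;
        for (int j = BATCH - 1; j >= 0; j--) {
            POW31[j] = p;
            p *= 31; // int overflow wraps, matching Arrays.hashCode arithmetic
        }
        POW31_BATCH = p;
    }

    static int batchedHashCode(byte[] a) {
        if (a == null) {
            return 0;
        }
        int h = 1;
        int i = 0;
        // Main loop: the 16 multiplies inside a batch do not depend on each other;
        // only the final combine depends on the previous value of h.
        for (; i + BATCH <= a.length; i += BATCH) {
            int sum = 0;
            for (int j = 0; j < BATCH; j++) {
                sum += a[i + j] * POW31[j];
            }
            h = h * POW31_BATCH + sum;
        }
        // Tail: remaining elements use the plain one-at-a-time recurrence.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        byte[] data = new byte[65];
        for (int k = 0; k < data.length; k++) {
            data[k] = (byte) (k * 37 - 50);
        }
        for (int n = 0; n <= data.length; n++) {
            byte[] a = Arrays.copyOf(data, n);
            if (batchedHashCode(a) != Arrays.hashCode(a)) {
                throw new AssertionError("mismatch at length " + n);
            }
        }
        System.out.println("batched result matches Arrays.hashCode");
    }
}
```

Because the per-element products within a batch are independent, the only serial dependency per iteration is the single multiply-add on `h`, which is the property the larger batch is meant to exploit.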

The optimization also reduces generated stub size for all supported element 
types, lowering instruction cache pressure in hot hashing workloads.

The optimization applies to boolean[], byte[], char[], short[], and int[] array 
hashing paths and is enabled only for array lengths >= 8. Shorter arrays 
continue to use the existing scalar implementation.

Generated stub size reduction:

| Element type | JDK 25 size | New size | Change |
| ------------ | ----------- | -------- | ------ |
| boolean      | 428 B       | 332 B    | -96 B  |
| byte         | 428 B       | 332 B    | -96 B  |
| char         | 408 B       | 332 B    | -76 B  |
| short        | 408 B       | 332 B    | -76 B  |
| int          | 324 B       | 300 B    | -24 B  |

## BYTE[] Arrays.hashCode throughput (ops/ms)

Lengths below 8 use the existing scalar path and are therefore expected to show 
no meaningful change. Improvement is the relative throughput change, 
(New - Baseline) / Baseline.

| Length | Baseline | New    | Improvement |
|--------|----------|--------|-------------|
| 2      | 696842   | 681572 | -2.2%       |
| 7      | 349082   | 349392 | +0.1%       |
| 8      | 309193   | 395677 | +28.0%      |
| 9      | 294240   | 367510 | +24.9%      |
| 15     | 160372   | 202718 | +26.4%      |
| 16     | 241651   | 348854 | +44.4%      |
| 17     | 228929   | 308820 | +34.9%      |
| 23     | 139463   | 186679 | +33.9%      |
| 24     | 177955   | 253809 | +42.6%      |
| 25     | 173594   | 253786 | +46.2%      |
| 31     | 113638   | 159672 | +40.5%      |
| 32     | 164228   | 214765 | +30.8%      |
| 33     | 155093   | 199425 | +28.6%      |
| 47     | 103190   | 135190 | +31.0%      |
| 48     | 116600   | 145178 | +24.5%      |
| 49     | 112067   | 163144 | +45.6%      |
| 63     | 79978    | 116111 | +45.2%      |
| 64     | 104182   | 130175 | +25.0%      |
| 65     | 101735   | 125010 | +22.9%      |


## CHAR[] Arrays.hashCode throughput (ops/ms)

| Length | Baseline | New    | Improvement |
|--------|----------|--------|-------------|
| 2      | 696254   | 696646 | +0.1%       |
| 7      | 351199   | 347674 | -1.0%       |
| 8      | 307065   | 398830 | +29.9%      |
| 9      | 279152   | 373828 | +33.9%      |
| 15     | 168873   | 211161 | +25.0%      |
| 16     | 246685   | 359181 | +45.6%      |
| 17     | 231574   | 319731 | +38.1%      |
| 23     | 140617   | 193354 | +37.5%      |
| 24     | 188697   | 289453 | +53.4%      |
| 25     | 181149   | 265244 | +46.4%      |
| 31     | 114859   | 168630 | +46.8%      |
| 32     | 178221   | 207204 | +16.3%      |
| 33     | 171169   | 231739 | +35.4%      |
| 47     | 105332   | 145419 | +38.1%      |
| 48     | 120754   | 197517 | +63.6%      |
| 49     | 115156   | 184969 | +60.6%      |
| 63     | 83664    | 127759 | +52.7%      |
| 64     | 119575   | 154688 | +29.4%      |
| 65     | 116870   | 147749 | +26.4%      |

## SHORT[] Arrays.hashCode throughput (ops/ms)

| Length | Baseline | New    | Improvement |
|--------|----------|--------|-------------|
| 2      | 697735   | 696917 | -0.1%       |
| 7      | 350484   | 348131 | -0.7%       |
| 8      | 305960   | 398837 | +30.4%      |
| 9      | 279146   | 367976 | +31.8%      |
| 15     | 167151   | 211794 | +26.7%      |
| 16     | 246754   | 358048 | +45.1%      |
| 17     | 231731   | 321910 | +38.9%      |
| 23     | 139937   | 188696 | +34.8%      |
| 24     | 184464   | 289120 | +56.7%      |
| 25     | 181133   | 265296 | +46.5%      |
| 31     | 114607   | 167787 | +46.4%      |
| 32     | 178193   | 259802 | +45.8%      |
| 33     | 171439   | 231916 | +35.3%      |
| 47     | 105341   | 145975 | +38.6%      |
| 48     | 120779   | 197006 | +63.1%      |
| 49     | 115701   | 185225 | +60.1%      |
| 63     | 83677    | 127688 | +52.6%      |
| 64     | 112239   | 155357 | +38.4%      |
| 65     | 116872   | 147735 | +26.4%      |

## INT[] Arrays.hashCode throughput (ops/ms)

| Length | Baseline | New    | Improvement |
|--------|----------|--------|-------------|
| 2      | 697667   | 697866 | +0.0%       |
| 7      | 351776   | 349918 | -0.5%       |
| 8      | 279132   | 398794 | +42.9%      |
| 9      | 282044   | 369000 | +30.8%      |
| 15     | 216797   | 212897 | -1.8%       |
| 16     | 228853   | 376437 | +64.5%      |
| 17     | 206776   | 310186 | +50.0%      |
| 23     | 168377   | 198746 | +18.0%      |
| 24     | 184100   | 278781 | +51.4%      |
| 25     | 172023   | 253821 | +47.6%      |
| 31     | 138354   | 171569 | +24.0%      |
| 32     | 173431   | 253249 | +46.0%      |
| 33     | 164210   | 232667 | +41.7%      |
| 47     | 117697   | 146898 | +24.8%      |
| 48     | 139514   | 192511 | +38.0%      |
| 49     | 134649   | 158293 | +17.6%      |
| 63     | 101384   | 132083 | +30.3%      |
| 64     | 118405   | 160644 | +35.7%      |
| 65     | 112848   | 146963 | +30.2%      |




---------
- [x] I confirm that I make this contribution in accordance with the [OpenJDK 
Interim AI Policy](https://openjdk.org/legal/ai).

-------------

Commit messages:
 - 8385513: AArch64: Improve ArraysSupport.vectorizedHashCode performance for 
large arrays

Changes: https://git.openjdk.org/jdk/pull/31674/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=31674&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8385513
  Stats: 596 lines in 6 files changed: 330 ins; 144 del; 122 mod
  Patch: https://git.openjdk.org/jdk/pull/31674.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/31674/head:pull/31674

PR: https://git.openjdk.org/jdk/pull/31674
