On Mon, 25 Aug 2025 07:13:43 GMT, Galder Zamarreño <[email protected]> wrote:

>> I've added support to vectorize `MoveD2L`, `MoveL2D`, `MoveF2I` and
>> `MoveI2F` nodes. The implementation follows a similar pattern to what is
>> done with conversion (`Conv*`) nodes. The tests in
>> `TestCompatibleUseDefTypeSize` have been updated with the new expectations.
>>
>> Also added a JMH benchmark which measures throughput (the higher the number
>> the better) for methods that exercise these nodes. On darwin/aarch64 it
>> shows:
>>
>>
>> Benchmark                                (seed)  (size)   Mode  Cnt      Base      Patch   Units   Diff
>> VectorBitConversion.doubleToLongBits          0    2048  thrpt    8  1168.782   1157.717  ops/ms    -1%
>> VectorBitConversion.doubleToRawLongBits       0    2048  thrpt    8  3999.387   7353.936  ops/ms   +83%
>> VectorBitConversion.floatToIntBits            0    2048  thrpt    8  1200.338   1188.206  ops/ms    -1%
>> VectorBitConversion.floatToRawIntBits         0    2048  thrpt    8  4058.248  14792.474  ops/ms  +264%
>> VectorBitConversion.intBitsToFloat            0    2048  thrpt    8  3050.313  14984.246  ops/ms  +391%
>> VectorBitConversion.longBitsToDouble          0    2048  thrpt    8  3022.691   7379.360  ops/ms  +144%
>>
>>
>> The improvements observed are a result of vectorization. The lack of
>> vectorization in `doubleToLongBits` and `floatToIntBits` demonstrates that
>> these changes do not affect their performance. These methods do not
>> vectorize because of flow control.
>>
>> I've run the tier1-3 tests on linux/aarch64 and didn't observe any
>> regressions.
>
> Galder Zamarreño has updated the pull request with a new target base due to a
> merge or a rebase. The incremental webrev excludes the unrelated changes
> brought in by the merge/rebase.
> The pull request contains 22 additional commits since the last revision:
>
>  - Merge branch 'master' into topic.fp-bits-vector
>  - Add more IR node positive assertions
>  - Fix source of data for benchmarks
>  - Refactor benchmarks to TypeVectorOperations
>  - Check at the very least that auto vectorization is supported
>  - Avoid VectorReinterpret::implemented
>  - Refactor and add copyright header
>  - Rephrase comment
>  - Removed unnecessary assert methods
>  - Adjust IR test after adding Move* vector support
>  - ... and 12 more: https://git.openjdk.org/jdk/compare/fc6e0b6f...e7e4d801

test/hotspot/jtreg/compiler/loopopts/superword/TestCompatibleUseDefTypeSize.java line 460:

> 458:     @IR(counts = {IRNode.LOAD_VECTOR_L, "> 0",
> 459:                   IRNode.STORE_VECTOR, "> 0",
> 460:                   IRNode.VECTOR_REINTERPRET, "> 0"},

Ah, I just saw that `VECTOR_REINTERPRET` is not a `vectorNode`, so we don't check the size for it. Would it have a type and size though? If so, we could consider making it more precise, like all the vector casts. Would be a little bit of work, but it would make the rules more precise. Could also be a separate RFE.

    public static final String VECTOR_REINTERPRET = PREFIX + "VECTOR_REINTERPRET" + POSTFIX;
    static {
        beforeMatchingNameRegex(VECTOR_REINTERPRET, "VectorReinterpret");
    }

    public static final String VECTOR_UCAST_B2S = VECTOR_PREFIX + "VECTOR_UCAST_B2S" + POSTFIX;
    static {
        vectorNode(VECTOR_UCAST_B2S, "VectorUCastB2X", TYPE_SHORT);
    }

Depending on the dump, it may not be so easy though. Not sure.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26457#discussion_r2313298675
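
For readers following the benchmark discussion quoted above, here is a minimal sketch of the two loop shapes being compared. The class and method names (`BitConversionSketch`, `rawBits`, `canonicalBits`) are made up for illustration and are not taken from the PR or its benchmark. It assumes the "flow control" mentioned in the PR description refers to the NaN check that `Float.floatToIntBits` performs before falling back to the raw conversion. Whether either loop actually vectorizes depends on the JIT and the platform, so the code only shows the observable difference in results.

```java
public class BitConversionSketch {
    // Loop body is a pure bit reinterpretation of each element,
    // with no per-element branch.
    static void rawBits(float[] in, int[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Float.floatToRawIntBits(in[i]);
        }
    }

    // Same loop, but floatToIntBits maps every NaN to the canonical
    // pattern 0x7fc00000, which implies a per-element NaN check.
    static void canonicalBits(float[] in, int[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Float.floatToIntBits(in[i]);
        }
    }

    public static void main(String[] args) {
        // A quiet NaN carrying a non-canonical payload in its low bits.
        float nanWithPayload = Float.intBitsToFloat(0x7fc00001);
        float[] in = {1.0f, nanWithPayload};
        int[] raw = new int[in.length];
        int[] canonical = new int[in.length];

        rawBits(in, raw);
        canonicalBits(in, canonical);

        System.out.printf("raw:       %08x %08x%n", raw[0], raw[1]);
        System.out.printf("canonical: %08x %08x%n", canonical[0], canonical[1]);
    }
}
```

For ordinary values both methods return the same bits; they differ only for NaN inputs, where the raw variant usually preserves the payload and the non-raw variant always yields `0x7fc00000`.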
