> Although the scalar AVX10 floating point min/max instructions (VMINMAXSD,
> VMINMAXSS, VMINMAXSH) are compact, it's better not to use them in reduction
> loops. This is because of serial data dependencies that get triggered across
> loop iterations. An alternate implementation using comparisons and jumps
> leverages branch prediction and limits the effects of data dependencies to
> cheaper instructions (e.g., MOV). Please note that this method is already used
> for non-AVX10 min/max reduction loop scenarios.
>
> With that background provided, these changes remove AVX10 floating point
> min/max instructions from single and double precision floating point
> reduction loops. They are replaced by the separate instruction sequence
> described above. Currently, min/max half precision floating point reduction
> loops aren't detectable, so they will be handled in a separate PR. There is
> also some code cleanup to remove unused instruction definitions while also
> adding necessary supporting infrastructure. In addition to tier-1, tier-2,
> tier-3, and tier-4 tests, the JTREG tests listed below were used to verify
> correctness with the recommended JVM options mentioned in corresponding
> source files. All modifications and tests used [OpenJDK
> v27-b12](https://github.com/openjdk/jdk/releases/tag/jdk-27%2B12) as the
> baseline build.
>
> 1. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector64Tests.java`
> 2. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector128Tests.java`
> 3. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector256Tests.java`
> 4. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector512Tests.java`
> 5. `jtreg:test/jdk/jdk/incubator/vector/DoubleVectorMaxTests.java`
> 6. `jtreg:test/jdk/jdk/incubator/vector/FloatVector64Tests.java`
> 7. `jtreg:test/jdk/jdk/incubator/vector/FloatVector128Tests.java`
> 8. `jtreg:test/jdk/jdk/incubator/vector/FloatVector256Tests.java`
> 9. `jtreg:test/jdk/jdk/incubator/vector/FloatVector512Tests.java`
> 10. `jtreg:test/jdk/jdk/incubator/vector/FloatVectorMaxTests.java`
> 11. `jtreg:test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java`
> 12. `jtreg:test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java`
> 13. `jtreg:test/hotspot/jtreg/compiler/igvn/TestMinMaxIdentity.java`
> 14. `jtreg:test/hotspot/jtreg/compiler/intrinsics/float16/TestFloat16MaxMinSpecialValues.java`
> 15. `jtreg:test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxReductions.java`
>
> Finally, the JMH micro-benchmarks listed below were updated to ensure all
> code paths are exercised.
>
> 1. `...
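
For readers unfamiliar with the pattern, the two loop shapes discussed in the description can be sketched at the Java source level. This is only an illustration of the logic, not the instruction sequence the PR emits: the class and method names (`MinReductionSketch`, `reduceWithMathMin`, `reduceWithBranches`) are hypothetical, and how C2 actually compiles either loop depends on the JDK build and CPU. The first method carries every iteration's result through `Math.min`, which is the loop-carried dependency the description refers to. The second uses compares and branches, so the accumulator is only rewritten by a plain assignment when a smaller value is seen, while keeping the NaN and signed-zero behavior of `Math.min`.

```java
final class MinReductionSketch {
    // Raw bit pattern of -0.0, used to tell -0.0 apart from +0.0.
    private static final long NEGATIVE_ZERO_BITS = Double.doubleToRawLongBits(-0.0d);

    // Reduction written with Math.min: each iteration needs the previous
    // result of min before it can produce the next one.
    static double reduceWithMathMin(double[] values) {
        double acc = Double.POSITIVE_INFINITY;
        for (double v : values) {
            acc = Math.min(acc, v);
        }
        return acc;
    }

    // Same reduction written with compares and branches (illustrative only).
    // The accumulator changes only via a simple assignment on the taken path.
    static double reduceWithBranches(double[] values) {
        double acc = Double.POSITIVE_INFINITY;
        for (double v : values) {
            if (v != v) {
                // v is NaN; a Math.min reduction would end up NaN as well.
                return v;
            }
            if (v < acc) {
                acc = v;
            } else if (v == 0.0d && acc == 0.0d
                    && Double.doubleToRawLongBits(v) == NEGATIVE_ZERO_BITS) {
                // Math.min treats -0.0 as smaller than +0.0.
                acc = v;
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        double[] data = {3.0, 0.0, -0.0, 1.5};
        System.out.println(reduceWithMathMin(data));
        System.out.println(reduceWithBranches(data));
    }
}
```

Both methods are meant to agree on every input, including arrays containing NaN or mixed signed zeros; the difference of interest is only in how the accumulator update is expressed.
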
Mohamed Issa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits:

 - Merge branch 'master' into user/missa-prime/avx10_2
 - Rename instructions to match AVX512 and AVX10 conventions while also updating some code generation paths.
 - Rename IR match nodes in tests to account for merging of scalar min/max definitions in x86.ad file.
 - Refactor some of the code and re-introduce some instructions previously eliminated while also adding new ones.
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Remove half precision min/max reduction definitions and adjust corresponding benchmarks.
 - Use alternative instruction flow for half precision reduction loops and add supporting infrastructure.
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Replace scalar AVX10.2 floating point min/max instructions with more efficient sequence

-------------

Changes: https://git.openjdk.org/jdk/pull/29831/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29831&range=05
  Stats: 754 lines in 11 files changed: 433 ins; 128 del; 193 mod
  Patch: https://git.openjdk.org/jdk/pull/29831.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/29831/head:pull/29831

PR: https://git.openjdk.org/jdk/pull/29831
