> Although the scalar AVX10 floating point min/max instructions (VMINMAXSD, 
> VMINMAXSS, VMINMAXSH) are compact, it's better not to use them in reduction 
> loops. In a reduction, the result of each iteration feeds the next one, so 
> these instructions create a serial data dependency across loop iterations. 
> An alternate implementation using comparisons and jumps leverages branch 
> prediction and limits the effects of data dependencies to cheaper 
> instructions (e.g., MOV). Please note that this method is already used for 
> non-AVX10 min/max reduction loop scenarios.
> 
> With that background provided, these changes remove AVX10 floating point 
> min/max instructions from single and double precision floating point 
> reduction loops. They are replaced by the separate instruction sequence 
> described above. Currently, min/max half precision floating point reduction 
> loops aren't detectable, so they will be handled in a separate PR. The 
> changes also remove unused instruction definitions and add the necessary 
> supporting infrastructure. In addition to tier-1, tier-2, tier-3, and tier-4 
> tests, the JTREG tests listed below were used to verify correctness with the 
> recommended JVM options mentioned in the corresponding source files. All 
> modifications and tests used [OpenJDK 
> v27-b12](https://github.com/openjdk/jdk/releases/tag/jdk-27%2B12) as the 
> baseline build.
> 
> 1. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector64Tests.java`
> 2. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector128Tests.java`
> 3. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector256Tests.java`
> 4. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector512Tests.java`
> 5. `jtreg:test/jdk/jdk/incubator/vector/DoubleVectorMaxTests.java`
> 6. `jtreg:test/jdk/jdk/incubator/vector/FloatVector64Tests.java`
> 7. `jtreg:test/jdk/jdk/incubator/vector/FloatVector128Tests.java`
> 8. `jtreg:test/jdk/jdk/incubator/vector/FloatVector256Tests.java`
> 9. `jtreg:test/jdk/jdk/incubator/vector/FloatVector512Tests.java`
> 10. `jtreg:test/jdk/jdk/incubator/vector/FloatVectorMaxTests.java`
> 11. `jtreg:test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java`
> 12. `jtreg:test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java`
> 13. `jtreg:test/hotspot/jtreg/compiler/igvn/TestMinMaxIdentity.java`
> 14. `jtreg:test/hotspot/jtreg/compiler/intrinsics/float16/TestFloat16MaxMinSpecialValues.java`
> 15. `jtreg:test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxReductions.java`
> 
> Finally, the JMH micro-benchmarks listed below were updated to ensure all 
> code paths are exercised.
> 
> 1. `...

Mohamed Issa has updated the pull request with a new target base due to a merge 
or a rebase. The pull request now contains 10 commits:

 - Merge branch 'master' into user/missa-prime/avx10_2
 - Rename instructions to match AVX512 and AVX10 conventions while also 
updating some code generation paths.
 - Rename IR match nodes in tests to account for merging of scalar min/max 
definitions in x86.ad file.
 - Refactor some of the code and re-introduce some instructions previously 
eliminated while also adding new ones.
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Remove half precision min/max reduction definitions and adjust corresponding 
benchmarks.
 - Use alternative instruction flow for half precision reduction loops and add 
supporting infrastructure.
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Replace scalar AVX10.2 floating point min/max instructions with more 
efficient sequence
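
For readers unfamiliar with the pattern, the two reduction shapes discussed in the description can be illustrated at the Java source level. This is only a rough sketch under stated assumptions: the class and method names are hypothetical, and the actual change is in the JIT's instruction selection for x86, not in Java code. The second method mimics the compare-and-jump idea while keeping `Math.min` semantics for NaN and signed zeros.

```java
// Hypothetical illustration only; not code from the pull request.
final class MinReductionSketch {

    // Reduction through Math.min: every iteration consumes the previous
    // result, so the min operation sits on a loop-carried dependency chain.
    static double minWithMathMin(double[] values) {
        double acc = Double.POSITIVE_INFINITY;
        for (double v : values) {
            acc = Math.min(acc, v);
        }
        return acc;
    }

    // Compare-and-branch variant: the accumulator is only overwritten
    // (a plain move) when the comparison says a smaller value was found.
    static double minWithBranches(double[] values) {
        double acc = Double.POSITIVE_INFINITY;
        for (double v : values) {
            if (v != v) {
                // v is NaN; Math.min would propagate NaN from here on.
                return Double.NaN;
            }
            boolean negativeZeroWins = v == 0.0 && acc == 0.0
                    && Double.doubleToRawLongBits(v) < 0; // v is -0.0
            if (v < acc || negativeZeroWins) {
                acc = v;
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        double[] data = {3.5, -1.25, 7.0, -0.0, 0.0};
        System.out.println(minWithMathMin(data));
        System.out.println(minWithBranches(data));
    }
}
```

Whether the branchy form is faster depends on how predictable the comparison is for the input data, which is the kind of question the JMH micro-benchmarks mentioned in the description are suited to answer.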

-------------

Changes: https://git.openjdk.org/jdk/pull/29831/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29831&range=05
  Stats: 754 lines in 11 files changed: 433 ins; 128 del; 193 mod
  Patch: https://git.openjdk.org/jdk/pull/29831.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/29831/head:pull/29831

PR: https://git.openjdk.org/jdk/pull/29831
