Hi all,
This patch optimizes SIMD kernels that make heavy use of broadcasted inputs
through the following reassociating ideal transformations:
VectorOperation (VectorBroadcast INP1, VectorBroadcast INP2) =>
VectorBroadcast (ScalarOperation INP1, INP2)
VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast INP2) INP3) =>
VectorOperation INP3 (VectorOperation (VectorBroadcast INP1) (VectorBroadcast INP2))
The idea is to push broadcasts across the vector operation and replace the
vector operation with an equivalent, cheaper scalar variant. Currently, the
patch handles the most common vector operations.
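To illustrate the intent of the two rewrites, here is a minimal plain-Java sketch. It is not the C2 implementation: `broadcast` and `add` are hypothetical stand-ins for VectorBroadcast and a lane-wise VectorOperation, modelled on `int[]` arrays, and integer addition is chosen only because reassociating it is exact.

```java
import java.util.Arrays;

public class BroadcastSketch {
    // Hypothetical stand-in for VectorBroadcast: every lane holds the same scalar.
    static int[] broadcast(int value, int lanes) {
        int[] v = new int[lanes];
        Arrays.fill(v, value);
        return v;
    }

    // Hypothetical stand-in for a lane-wise VectorOperation (integer add).
    static int[] add(int[] a, int[] b) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[i] + b[i];
        }
        return r;
    }

    // First rewrite, before: vector add of two broadcasts.
    static int[] rule1Before(int a, int b, int lanes) {
        return add(broadcast(a, lanes), broadcast(b, lanes));
    }

    // First rewrite, after: one scalar add followed by a single broadcast.
    static int[] rule1After(int a, int b, int lanes) {
        return broadcast(a + b, lanes);
    }

    // Second rewrite, before: the broadcasts are separated by a non-broadcast input v.
    static int[] rule2Before(int a, int b, int[] v) {
        return add(broadcast(a, v.length), add(broadcast(b, v.length), v));
    }

    // Second rewrite, after: the broadcasts are grouped together, so the inner
    // operation can then be folded by the first rewrite.
    static int[] rule2After(int a, int b, int[] v) {
        return add(v, rule1After(a, b, v.length));
    }

    public static void main(String[] args) {
        int[] v = {1, 2, 3, 4};
        System.out.println(Arrays.equals(rule1Before(3, 4, 4), rule1After(3, 4, 4)));
        System.out.println(Arrays.equals(rule2Before(3, 4, v), rule2After(3, 4, v)));
    }
}
```

In this model the rewritten forms perform one vector add fewer than the originals, which is the saving the transformations aim for.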
Following are the performance numbers for the benchmark included with this patch on
latest-generation x86 targets:
**AMD Turin (2.1GHz)**
<img width="1122" height="355" alt="image"
src="https://github.com/user-attachments/assets/3f5087bf-0e14-4c56-b0c2-3d23253bad54"
/>
**Intel Granite Rapids (2.1GHz)**
<img width="1105" height="325" alt="image"
src="https://github.com/user-attachments/assets/c8481f86-4db2-4c4e-bd65-51542c59fe63"
/>
Kindly review and share your feedback.
Best Regards,
Jatin
-------------
Commit messages:
- Adding JTREG testcase
- Clanups
- Adding a benchmark
- Adding VectorNode::Ideal on Idealazation exit
- Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8358521
- Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8358521
- Updates
- Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8358521
- Skip optimizing predicated vector operations
- 8358521: Optimize vector operations with broadcasted inputs
Changes: https://git.openjdk.org/jdk/pull/25617/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25617&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8358521
Stats: 911 lines in 6 files changed: 887 ins; 15 del; 9 mod
Patch: https://git.openjdk.org/jdk/pull/25617.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25617/head:pull/25617
PR: https://git.openjdk.org/jdk/pull/25617