> Hi all,
>
> This patch optimizes SIMD kernels that make heavy use of broadcasted inputs
> through the following reassociating ideal transformations:
>
> VectorOperation (VectorBroadcast INP1, VectorBroadcast INP2) =>
>     VectorBroadcast (ScalarOperation INP1, INP2)
>
> VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast INP2) INP3) =>
>     VectorOperation INP3 (VectorOperation (VectorBroadcast INP1) (VectorBroadcast INP2))
>
> The idea is to push broadcasts across the vector operation and replace the
> vector operation with an equivalent, cheaper scalar variant. Currently, the
> patch handles the most common vector operations.
>
> The following are the performance numbers for the benchmark included with
> this patch on latest-generation x86 targets:
>
> **AMD Turin (2.1GHz)**
> <img width="1122" height="355" alt="image" src="https://github.com/user-attachments/assets/3f5087bf-0e14-4c56-b0c2-3d23253bad54" />
>
> **Intel Granite Rapids (2.1GHz)**
> <img width="1105" height="325" alt="image" src="https://github.com/user-attachments/assets/c8481f86-4db2-4c4e-bd65-51542c59fe63" />
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin

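For readers who want to see why the two rewrites quoted above preserve results, here is a minimal sketch in plain Java. It does not use the Vector API or any C2 internals: vectors are modeled as `int[]` arrays, and the class name, the lane count `LANES`, and all helper methods are hypothetical names chosen for illustration. Integer addition stands in for `VectorOperation` because it is associative and commutative; which operations the patch actually covers is not something this sketch claims.

```java
import java.util.Arrays;

// Illustrative model only: arrays stand in for vectors, and integer
// addition stands in for a generic associative, commutative VectorOperation.
public class BroadcastReassociation {
    static final int LANES = 8; // hypothetical vector width

    // Models VectorBroadcast: every lane holds the same scalar.
    public static int[] broadcast(int scalar) {
        int[] v = new int[LANES];
        Arrays.fill(v, scalar);
        return v;
    }

    // Models a lane-wise VectorOperation (integer add).
    public static int[] vectorAdd(int[] a, int[] b) {
        int[] r = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            r[i] = a[i] + b[i];
        }
        return r;
    }

    // Rule 1, left-hand side: two broadcasts feeding one vector operation.
    public static int[] rule1Before(int inp1, int inp2) {
        return vectorAdd(broadcast(inp1), broadcast(inp2));
    }

    // Rule 1, right-hand side: one scalar operation, then one broadcast.
    public static int[] rule1After(int inp1, int inp2) {
        return broadcast(inp1 + inp2);
    }

    // Rule 2, left-hand side: the broadcasts are separated by INP3.
    public static int[] rule2Before(int inp1, int inp2, int[] inp3) {
        return vectorAdd(broadcast(inp1), vectorAdd(broadcast(inp2), inp3));
    }

    // Rule 2, right-hand side: the broadcasts are grouped together, so the
    // inner operation now matches the left-hand side of rule 1.
    public static int[] rule2After(int inp1, int inp2, int[] inp3) {
        return vectorAdd(inp3, vectorAdd(broadcast(inp1), broadcast(inp2)));
    }

    public static void main(String[] args) {
        int[] inp3 = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(Arrays.equals(rule1Before(3, 4), rule1After(3, 4)));
        System.out.println(Arrays.equals(rule2Before(3, 4, inp3), rule2After(3, 4, inp3)));
    }
}
```

In this model, the second rewrite does not remove work by itself; it regroups the operands so that the first rewrite can then fold the inner operation into a single scalar add followed by one broadcast.
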
Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Review comments resolutions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25617/files
  - new: https://git.openjdk.org/jdk/pull/25617/files/eb53527c..2a044457

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25617&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25617&range=00-01

  Stats: 133 lines in 3 files changed: 110 ins; 10 del; 13 mod
  Patch: https://git.openjdk.org/jdk/pull/25617.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25617/head:pull/25617

PR: https://git.openjdk.org/jdk/pull/25617
