> Hi all,
> 
> This patch optimizes SIMD kernels that make heavy use of broadcasted inputs 
> through the following reassociating ideal transformations.
> 
> 
>  VectorOperation (VectorBroadcast INP1, VectorBroadcast INP2) => 
>                             VectorBroadcast (ScalarOperation INP1, INP2)
> 
>  VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast INP2) INP3) => 
>                              VectorOperation INP3 (VectorOperation (VectorBroadcast INP1) (VectorBroadcast INP2))
> 
> 
> The idea is to push broadcasts across the vector operation and replace the 
> vector operation with an equivalent, cheaper scalar variant.  Currently, the 
> patch handles the most common vector operations.
> 
> The following are the performance numbers for the benchmark included with 
> this patch on latest-generation x86 targets:
> 
> **AMD Turin (2.1GHz)**
> <img width="1122" height="355" alt="image" src="https://github.com/user-attachments/assets/3f5087bf-0e14-4c56-b0c2-3d23253bad54" />
> 
> **Intel Granite Rapids (2.1GHz)**
> <img width="1105" height="325" alt="image" src="https://github.com/user-attachments/assets/c8481f86-4db2-4c4e-bd65-51542c59fe63" />
> 
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin
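
The two rewrites in the quoted description can be illustrated with a small model in plain Java. This is only a sketch, not the patch's C2 implementation: a "vector" is modeled as an `int[]`, and the class and method names (`BroadcastReassociationSketch`, `broadcast`, `lanewise`, `rule1Before`, and so on) are hypothetical names chosen for illustration. The second rewrite assumes the operation is associative and commutative, which holds for wrapping `int` addition as used here.

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;

// Hypothetical model: a "vector" is an int[]; a broadcast fills every lane
// with the same scalar. None of these names come from the patch itself.
final class BroadcastReassociationSketch {

    // Models VectorBroadcast: replicate one scalar into all lanes.
    static int[] broadcast(int value, int lanes) {
        int[] v = new int[lanes];
        Arrays.fill(v, value);
        return v;
    }

    // Models a lanewise binary VectorOperation.
    static int[] lanewise(IntBinaryOperator op, int[] a, int[] b) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = op.applyAsInt(a[i], b[i]);
        }
        return r;
    }

    // Rule 1, before: VectorOperation (VectorBroadcast x, VectorBroadcast y)
    static int[] rule1Before(IntBinaryOperator op, int x, int y, int lanes) {
        return lanewise(op, broadcast(x, lanes), broadcast(y, lanes));
    }

    // Rule 1, after: VectorBroadcast (ScalarOperation x, y)
    static int[] rule1After(IntBinaryOperator op, int x, int y, int lanes) {
        return broadcast(op.applyAsInt(x, y), lanes);
    }

    // Rule 2, before: two vector operations, with the non-broadcast input v innermost.
    static int[] rule2Before(IntBinaryOperator op, int x, int y, int[] v) {
        return lanewise(op, broadcast(x, v.length),
                        lanewise(op, broadcast(y, v.length), v));
    }

    // Rule 2, after reassociation and rule 1: one scalar operation, one broadcast,
    // one vector operation. Valid only if op is associative and commutative.
    static int[] rule2After(IntBinaryOperator op, int x, int y, int[] v) {
        return lanewise(op, v, broadcast(op.applyAsInt(x, y), v.length));
    }

    public static void main(String[] args) {
        IntBinaryOperator add = Integer::sum;
        int[] v = {1, 2, 3, 4};
        System.out.println(Arrays.equals(rule1Before(add, 5, 7, 4), rule1After(add, 5, 7, 4)));
        System.out.println(Arrays.equals(rule2Before(add, 5, 7, v), rule2After(add, 5, 7, v)));
    }
}
```

In this model the rewritten forms perform one scalar operation in place of a full lanewise operation, which is the saving the description refers to; whether a given operation qualifies in the real compiler depends on the patch itself.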

Jatin Bhateja has updated the pull request incrementally with one additional 
commit since the last revision:

  Review comments resolutions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25617/files
  - new: https://git.openjdk.org/jdk/pull/25617/files/eb53527c..2a044457

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25617&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25617&range=00-01

  Stats: 133 lines in 3 files changed: 110 ins; 10 del; 13 mod
  Patch: https://git.openjdk.org/jdk/pull/25617.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25617/head:pull/25617

PR: https://git.openjdk.org/jdk/pull/25617
