[PATCH] aarch64: Don't include vec_select in SIMD multiply cost

Jonathan Wright via Gcc-patches Tue, 20 Jul 2021 03:47:34 -0700

Hi,

The Neon multiply/multiply-accumulate/multiply-subtract instructions
can take various forms - multiplying full vector registers of values
or multiplying one vector by a single element of another. Regardless
of the form used, these instructions have the same cost, and this
should be reflected by the RTL cost function.


This patch adds RTL tree traversal to the Neon multiply cost function
to match the vec_select used by the lane-referencing forms of these
instructions. The traversal prevents the cost of the vec_select from
being added to the cost of the multiply, so the combine pass no
longer deems these instructions prohibitively expensive and can emit
them.
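
As a rough illustration of the idea (this is not the code in
rb14675.patch), the toy C model below shows how stripping the
lane-selection wrappers before costing the operands makes a
by-element multiply cost the same as a full-vector multiply. All
names here (toy_rtx, toy_strip_lane, toy_mult_cost) and the cost
values are hypothetical; GCC's real rtx types and
aarch64_rtx_mult_cost are not used.

```c
#include <stddef.h>

/* Toy stand-ins for RTL codes; hypothetical, not GCC's rtx_code.  */
enum toy_code { TOY_REG, TOY_VEC_SELECT, TOY_VEC_DUPLICATE, TOY_MULT };

/* Simplified expression node.  A real vec_select also carries a
   parallel describing the selected lanes; that is omitted here.  */
struct toy_rtx
{
  enum toy_code code;
  const struct toy_rtx *op0;  /* First operand, or NULL.  */
  const struct toy_rtx *op1;  /* Second operand, or NULL.  */
};

/* Made-up per-node costs: registers are free, everything else is 4.  */
static int
toy_node_cost (enum toy_code code)
{
  return code == TOY_REG ? 0 : 4;
}

/* Naive cost: every node in the tree contributes, so the vec_select
   and vec_duplicate of a by-element operand are charged in full.  */
static int
toy_naive_cost (const struct toy_rtx *x)
{
  if (x == NULL)
    return 0;
  return toy_node_cost (x->code)
	 + toy_naive_cost (x->op0)
	 + toy_naive_cost (x->op1);
}

/* Look through an optional vec_duplicate and an optional vec_select
   so that the caller costs only the underlying operand.  */
static const struct toy_rtx *
toy_strip_lane (const struct toy_rtx *x)
{
  if (x->code == TOY_VEC_DUPLICATE && x->op0 != NULL)
    x = x->op0;
  if (x->code == TOY_VEC_SELECT && x->op0 != NULL)
    x = x->op0;
  return x;
}

/* Multiply cost that ignores the lane-selection wrappers on either
   operand; non-multiply expressions fall back to the naive cost.  */
static int
toy_mult_cost (const struct toy_rtx *x)
{
  if (x->code != TOY_MULT)
    return toy_naive_cost (x);
  return toy_node_cost (TOY_MULT)
	 + toy_mult_cost (toy_strip_lane (x->op0))
	 + toy_mult_cost (toy_strip_lane (x->op1));
}
```

With these made-up costs, mult (a, vec_duplicate (vec_select (b)))
costs 12 under toy_naive_cost but 4 under toy_mult_cost, the same as
the full-vector mult (a, b).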

Regression tested and bootstrapped on aarch64-none-linux-gnu - no
issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-07-19  Jonathan Wright  <jonathan.wri...@arm.com>

        * config/aarch64/aarch64.c (aarch64_rtx_mult_cost): Traverse
        RTL tree to prevent vec_select from being added into Neon
        multiply cost.

Attachment: rb14675.patch
