[Bug target/97127] FMA3 code transformation leads to slowdown on Skylake

crazylht at gmail dot com via Gcc-bugs Thu, 24 Sep 2020 03:46:29 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127


--- Comment #12 from Hongtao.liu <crazylht at gmail dot com> ---
Correct AVX256 load cost outside of register allocation and vectorizer

> they are
> 1. AVX256 Load  ---- 16
> 2. FMA3 ymm,ymm,ymm --- 16
> 3. AVX256 Regmove  --- 2
> 4. FMA3 mem,ymm,ymm --- 32

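That's why pass_combine combines *AVX256 load* and *FMA3 ymm,ymm,ymm* into
*FMA3 mem,ymm,ymm*: the two separate instructions cost 16 + 16 = 32, and the
combined instruction also costs 32, so the combination is not seen as more
expensive.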
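
To make the arithmetic concrete, here is a minimal sketch in C of the comparison
described above. It is not GCC code: the helper names are hypothetical, and it
assumes the rule is "keep the replacement unless it is strictly more expensive
than the instructions it replaces". Only the three cost values are taken from
the list quoted above.

```c
#include <stdbool.h>

/* Costs quoted in the comment above (same units as in that list). */
enum {
    COST_AVX256_LOAD = 16,  /* AVX256 Load */
    COST_FMA3_REG    = 16,  /* FMA3 ymm,ymm,ymm */
    COST_FMA3_MEM    = 32   /* FMA3 mem,ymm,ymm */
};

/* Hypothetical helper, not a GCC function. Assumption: a replacement is
   kept unless it costs strictly more than the insns it replaces. */
static bool replacement_accepted(int old_cost, int new_cost)
{
    return new_cost <= old_cost;
}

/* Hypothetical helper: would the load + register-form FMA pair be merged
   into the memory-operand FMA under the costs listed above? */
static bool fma_load_would_combine(void)
{
    int old_cost = COST_AVX256_LOAD + COST_FMA3_REG;  /* 16 + 16 */
    return replacement_accepted(old_cost, COST_FMA3_MEM);
}
```

Under these assumptions the tie (32 vs 32) is enough for the memory-operand
form to be chosen; any cost above 32 for *FMA3 mem,ymm,ymm* would leave the two
instructions separate.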
