[Bug target/96918] Failure to optimize vector shift left+shift right+or to pshuf

lists at coryfields dot com via Gcc-bugs Wed, 14 Jan 2026 07:24:14 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96918


--- Comment #12 from Cory Fields <lists at coryfields dot com> ---
> probably the target could advertise a rotate insn for that mode, restricted 
> to an argument of 8?

It seems this is already the case for avx512vl? There, my example above becomes
a vprold.

This missed optimization leads to a 25% slowdown for chacha20 on avx2 compared
to clang, due to the pessimized 8-bit/16-bit rotates.
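
For reference, a minimal sketch of the idiom in question. This is not the
example from my earlier comment; the type and function names are made up for
illustration. It uses the GCC/Clang vector extension with a 128-bit vector so
it builds without -mavx2, and the 256-bit case has the same shape. The
byte-wise reference assumes a little-endian target such as x86. It only shows
that a rotate by 8 or 16 bits is a pure byte permutation within each 32-bit
lane, which is why a byte shuffle can replace the two shifts and the or.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Four 32-bit lanes, using the GCC/Clang vector extension. */
typedef uint32_t v4u32 __attribute__((vector_size(16)));

/* Rotate each lane left by 8: shift left + shift right + or. */
static inline v4u32 rotl8_shift_or(v4u32 x)
{
    return (x << 8) | (x >> 24);
}

/* Same idiom for a 16-bit rotate. */
static inline v4u32 rotl16_shift_or(v4u32 x)
{
    return (x << 16) | (x >> 16);
}

/* Hypothetical reference: rotate each lane left by nbytes whole bytes,
 * written as a byte permutation (assumes little-endian byte order). */
static inline v4u32 rotl_bytes(v4u32 x, unsigned nbytes)
{
    unsigned char in[16], out[16];
    v4u32 r;

    assert(nbytes < 4);
    memcpy(in, &x, sizeof in);
    for (unsigned i = 0; i < 16; i++) {
        unsigned lane = i & ~3u;  /* first byte of this 32-bit lane */
        unsigned byte = i & 3u;   /* byte index within the lane */
        out[i] = in[lane + ((byte + 4u - nbytes) & 3u)];
    }
    memcpy(&r, out, sizeof r);
    return r;
}

/* Returns nonzero if the shift/or form and the byte-permutation form agree. */
static inline int rotates_match(v4u32 x)
{
    v4u32 a = rotl8_shift_or(x), b = rotl_bytes(x, 1);
    v4u32 c = rotl16_shift_or(x), d = rotl_bytes(x, 2);
    return memcmp(&a, &b, sizeof a) == 0 && memcmp(&c, &d, sizeof c) == 0;
}
```

The ChaCha20 quarter round rotates by 16, 12, 8 and 7 bits, so only the 16 and
8 cases fall into this byte-aligned category.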

Would avx2 advertising this as a rotate be the preferred solution here? I'm not
familiar with the codebase, but I could try to implement that if so.
