[Bug target/99434] std::bit_cast generates more instructions than __builtin_bit_cast and memcpy with -march=native

unlvsur at live dot com via Gcc-bugs Sat, 06 Mar 2021 13:36:58 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99434


--- Comment #2 from cqwrteur <unlvsur at live dot com> ---
(In reply to Andrew Pinski from comment #1)
> This is just a register allocation issue dealing with mulx and TImode.
> 
> If mulq was used instead (that is without -march=native), all of the
> functions are done correctly.

I do not think so. I think GCC generally handles patterns like this poorly. I
have even found a way to reproduce the suboptimal code generation
deterministically, in several different forms.

For example:
https://godbolt.org/z/PbobYG

Whenever GCC deals with shifts such as >>32 or >>64, it generates slower code.
This reproduces even without -march=native.

Clang, by contrast, generates exactly the same assembly for the equivalent
variants, which suggests my expectation is correct. GCC handles this case
poorly.

It looks like we need more tree-level optimizations for these patterns.
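
A minimal sketch of the kind of pattern under discussion, to make the
comparison concrete. This is not the code behind the Compiler Explorer link
above. The names u64x2, mulhi_shift, mulhi_memcpy, mulhi_builtin and
check_variants_agree are hypothetical, and the sketch assumes a little-endian
64-bit target with the unsigned __int128 extension and a compiler that
provides __builtin_bit_cast. All three functions compute the high 64 bits of a
64x64 multiplication, so one would expect them to compile to the same
instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: the two 64-bit halves of a 128-bit value.
// Assumes little-endian layout, so `lo` comes first in memory.
struct u64x2 {
    std::uint64_t lo;
    std::uint64_t hi;
};

static_assert(sizeof(u64x2) == sizeof(unsigned __int128),
              "u64x2 must cover exactly one 128-bit value");

// Variant 1: take the high half with a >>64 shift.
inline std::uint64_t mulhi_shift(std::uint64_t a, std::uint64_t b) {
    unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    return static_cast<std::uint64_t>(p >> 64);
}

// Variant 2: reinterpret the product through memcpy.
inline std::uint64_t mulhi_memcpy(std::uint64_t a, std::uint64_t b) {
    unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    u64x2 parts;
    std::memcpy(&parts, &p, sizeof parts);
    return parts.hi;
}

// Variant 3: reinterpret the product through __builtin_bit_cast.
inline std::uint64_t mulhi_builtin(std::uint64_t a, std::uint64_t b) {
    unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    return __builtin_bit_cast(u64x2, p).hi;
}

// Sanity check that the three variants return the same value for one input.
inline void check_variants_agree(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t expected = mulhi_shift(a, b);
    assert(mulhi_memcpy(a, b) == expected);
    assert(mulhi_builtin(a, b) == expected);
}
```

Compiling each function separately at -O2, with and without -march=native,
and comparing the assembly across GCC and Clang is one way to check whether
the variants are treated identically.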
