https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106594
--- Comment #5 from Roger Sayle <roger at nextmovesoftware dot com> ---
Hi Tamar,

I think this is where I need to apologize. Combine now canonicalizes these equivalent RTL expressions to the zero_extend form, on the assumption that zero extension has no data dependency and is cheaper, or at worst the same speed, on many targets. Unfortunately, aarch64 has patterns (splitters or peephole2s) for optimizing the sign_extend version that don't exist for the zero_extend version [even though the instruction set is symmetric and should handle both sxtw/uxtw].

Technically, these were just missed optimizations before, but I'm guessing my changes (to both trees and RTL) altered the form that the backend encounters, and that this is what causes the code quality regression. This should be easy to fix; I just need to get up to speed on the instructions that aarch64 supports and on which zero-extended forms are currently missing.

I'm sure that if GCC instead canonicalized to the sign_extend form, other targets would show similar asymmetries (it's only when things change that anyone notices the difference). I'll see if I can come up with a fix over the weekend.
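
For readers unfamiliar with why a sign_extend and a zero_extend can be "equivalent" at all: one common case, assumed here purely for illustration and not necessarily the exact case in this PR, is a narrow value whose top bit is known to be clear. The sketch below models the two 32-to-64-bit widenings in C; the helper names are hypothetical and are not GCC internals.

```c
#include <assert.h>
#include <stdint.h>

/* Widen 32 -> 64 bits by replicating bit 31 (what sxtw does).
   The uint32_t -> int32_t conversion is implementation-defined for
   values above INT32_MAX before C23; GCC reduces it modulo 2^32. */
static int64_t widen_signed(uint32_t x)
{
    return (int64_t)(int32_t)x;
}

/* Widen 32 -> 64 bits by filling the upper half with zeros (what uxtw does). */
static int64_t widen_unsigned(uint32_t x)
{
    return (int64_t)(uint64_t)x;
}

/* Returns 1 when both widenings produce the same 64-bit value.
   This holds exactly when bit 31 of x is clear. */
static int widenings_agree(uint32_t x)
{
    return widen_signed(x) == widen_unsigned(x);
}
```

When the compiler can prove that bit 31 is clear (for example, after masking with 0x7fffffff), either form is a valid rewrite of the other, so the choice of canonical form only matters for which backend patterns end up matching.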