https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83565
Jim Wilson <wilson at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilson at gcc dot gnu.org

--- Comment #11 from Jim Wilson <wilson at gcc dot gnu.org> ---
On Itanium, when you take a paradoxical subreg, the upper bits are undefined. Note for instance that we use the exact same instruction for addsi3 and adddi3. So after an addsi3, the upper bits may be garbage, because the add instruction writes to all 64 bits. It is the compare instruction that matters. We have a separate cmp4 for SImode and cmp for DImode, where cmp4 ignores the upper bits of the register because they are garbage bits.

This part
>r358 is known to have the high half zero:
>   22: r358:DI=r357:SI#0^r341:SI#0
is wrong. The upper bits of both registers are unknown, and the xor of unknown bits is not known to be zero.

I think the problem is in nonzero_bits1 in rtlanal.c, in the SUBREG case. On ia64, WORD_REGISTER_OPERATIONS is true and load_extend_op of SImode is ZERO_EXTEND, so the code decides that the upper bits must be zero. But LOAD_EXTEND_OP only applies to memory operations, and we do not have a subreg of memory here; we have a subreg of a register. When we have a subreg of a reg on Itanium, the upper bits are unknown. So the optimization performed here is wrong.

One could perhaps argue that the ia64 port is using WORD_REGISTER_OPERATIONS incorrectly, I suppose. I'm not sure what would happen if we removed that define.

Anyway, since addsi3 and many other SImode patterns leave the upper bits as garbage, it doesn't make sense to argue that rotate patterns must zero the upper bits.
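
The point about garbage upper bits can be illustrated with a small stand-alone model in plain C. This is only a sketch under stated assumptions: registers are modelled as uint64_t values, and the helper names (add_insn, cmp4_eq, cmp_eq, high_half, run_example) are hypothetical, not GCC internals or ia64 intrinsics. It shows a 32-bit add carried out by a 64-bit add instruction leaving a nonzero high half, which a cmp4-style compare ignores but which survives an xor of the full registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one add instruction serves both SImode and DImode,
   so it always writes all 64 bits of the destination register. */
uint64_t add_insn(uint64_t a, uint64_t b) { return a + b; }

/* cmp4-style compare: looks only at the low 32 bits. */
bool cmp4_eq(uint64_t a, uint64_t b) { return (uint32_t)a == (uint32_t)b; }

/* cmp-style compare: looks at all 64 bits. */
bool cmp_eq(uint64_t a, uint64_t b) { return a == b; }

/* Upper 32 bits of a modelled register. */
uint64_t high_half(uint64_t r) { return r >> 32; }

/* Walks through the scenario; returns 0 if every check holds. */
int run_example(void)
{
    /* SImode add of 0xFFFFFFFF + 1: the 32-bit result is 0, but the
       carry lands in bit 32 of the 64-bit register. */
    uint64_t r357 = add_insn(0xFFFFFFFFu, 1);
    uint64_t r341 = 0; /* a register whose upper bits happen to be clean */

    assert(cmp4_eq(r357, 0));  /* the SImode value is 0 ...              */
    assert(!cmp_eq(r357, 0));  /* ... but the DImode value is not        */

    /* xor of the paradoxical subregs, as in insn 22 of the quoted dump */
    uint64_t r358 = r357 ^ r341;
    assert(high_half(r358) != 0); /* high half is not known to be zero */
    return 0;
}
```

Calling run_example() from a test harness exercises the whole scenario. The design choice is deliberately minimal: it models only the register width and the two compare flavours, which is all the argument above relies on.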