https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86487
--- Comment #2 from avieira at gcc dot gnu.org ---
I am having quite a lot of trouble understanding what is going wrong, or maybe I should say, which parts are going right.

I believe it tries to match the fifth alternative for anddi3_insn here, which is:

    '&r' 'r' 'De'

This fails because of the early clobber, and rightfully so, because in:

    (insn 13 11 14 2 (set (reg:DI 0 r0 [125])
            (and:DI (reg:DI 1 r1 [+-4 ])
                (const_int 1 [0x1]))) "../t.c":3 79 {*anddi3_insn}
         (nil))

DI r0 overlaps with DI r1, since you need two consecutive GPRs to hold a DImode value.

I decided to debug reload to find out why it had picked r1, and I found that 'get_hard_regno' first picks r2 for (subreg:DI (SI 122)) in the same instruction. If we go up we see:

    (insn 10 9 11 2 (set (reg:SI 2 r2 [122])
            (xor:SI (reg:SI 0 r0 [orig:123 a ] [123])
                (const_int 1 [0x1]))) "../t.c":3 111 {*arm_xorsi3}
         (nil))

'get_hard_regno' then invokes 'subreg_regno_offset', which returns 'nregs_xmode - nregs_ymode' as the offset in big endian for paradoxical subregs with offset 0, where xmode is the inner mode and ymode is the outer mode. That is '-1' in our case (and always negative). So I believe reload is now seeing 'r1-r2' as the register pair for that first 'and' operand and 'r0-r1' as the destination operand.

At first I thought this was a middle-end issue, specifically with paradoxical subregs. However, I also saw a bit of AArch64 big-endian assembly that used 'odd' registers to represent DI register pairs (V2DI). Then there is the comment in 'subreg_regno_offset':

    /* If this is a big endian paradoxical subreg, which uses more
       actual hard registers than the original register, we must
       return a negative offset so that we find the proper highpart
       of the register.

       We assume that the ordering of registers within a multi-register
       value has a consistent endianness: if bytes and register words
       have different endianness, the hard registers that make up a
       multi-register value must be at least word-sized.  */

This made me start to think that GCC expects register pairs in big endian to be "called" by their Least Significant Register (LSR) and to be counted back from there, so that '[r1, r0]' would be called (DI r1). I am not entirely sure about this though...

I tried changing the arm back-end to only accept DImode register pairs if the register is odd. That fixed this case but broke a lot of other things. I think another way to fix it would be to adapt Arm's 's_register_operand' to not accept paradoxical subregs in big endian, but I would first like to understand how the middle end expects/sees/generates register pairs when 'REG_WORDS_BIG_ENDIAN' is true.
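
To make the arithmetic above concrete, here is a small stand-alone C model of how I understand the register numbers come out. It is only a sketch and not GCC code: the function names are made up for illustration, it assumes 4-byte GPRs as on 32-bit Arm, and the little-endian branch returning 0 is my assumption rather than something taken from this report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a GCC function: number of 4-byte hard
   registers needed to hold a value of BYTES bytes.  */
int nregs_for_bytes(int bytes)
{
    assert(bytes > 0);
    return (bytes + 3) / 4;
}

/* Simplified model of the case discussed above: a paradoxical subreg
   at byte offset 0.  In big endian the offset is
   nregs(inner) - nregs(outer); in little endian it is assumed to be 0.  */
int paradoxical_regno_offset(int inner_bytes, int outer_bytes,
                             bool big_endian)
{
    assert(outer_bytes > inner_bytes); /* paradoxical: outer is wider */
    if (!big_endian)
        return 0;
    return nregs_for_bytes(inner_bytes) - nregs_for_bytes(outer_bytes);
}

/* First hard register of the multi-register value that the subreg
   is taken to occupy, given the hard register of the inner value.  */
int first_hard_regno(int inner_regno, int inner_bytes, int outer_bytes,
                     bool big_endian)
{
    return inner_regno
           + paradoxical_regno_offset(inner_bytes, outer_bytes, big_endian);
}

/* True if the register ranges [a_first, a_first + a_nregs) and
   [b_first, b_first + b_nregs) share at least one register.  */
bool ranges_overlap(int a_first, int a_nregs, int b_first, int b_nregs)
{
    return a_first < b_first + b_nregs && b_first < a_first + a_nregs;
}
```

With SI 122 in r2, first_hard_regno(2, 4, 8, true) gives 1, so the operand is treated as the pair r1-r2. The destination DI r0 is the pair r0-r1, and ranges_overlap(0, 2, 1, 2) is true, which is the early-clobber conflict on r1 described above.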