https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81763
--- Comment #40 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #37)
> (In reply to Jakub Jelinek from comment #33)
>
> > and it should work.  The last case would be right now:
> > SI:N+1 = SI:N &~ SI:N+2; SI:N+2 = SI:N+1 &~ SI:N+3;
> > and is again wrong, but we could again swap:
> > SI:N+2 = SI:N+1 &~ SI:N+3; SI:N+1 = SI:N &~ SI:N+2;
> > and all is fine.
>
> Whoops, it looks that SI:N+2 is clobbered in the swapped case.

You're right.  So the question is if IRA/LRA can ever allow that case where
there is partial overlap with both registers.
I've tried hard to simulate that case with:

unsigned long long
foo (unsigned long long x, unsigned long long y)
{
  unsigned long long z;
  asm ("" : "+A" (x), "+Q" (y));
  z = x & ~y;
  asm ("" : "+Q" (z) : "a" (0), "b" (0));
  return z;
}

where IRA indeed allocates the used pseudos such that x is in ax:dx, y in
cx:bx and z in dx:cx.  Now, if I try this and testcase with ~x & y instead of
x & ~y with GCC patched with #c36, I get:

        andn    %eax, %ecx, %ecx
        xorl    %eax, %eax
        andn    %edx, %ebx, %ebx
        movl    %ecx, %edx
        movl    %ebx, %ecx
        movl    %eax, %ebx

resp.

        andn    %ecx, %eax, %ecx
        xorl    %eax, %eax
        andn    %ebx, %edx, %ebx
        movl    %ecx, %edx
        movl    %ebx, %ecx
        movl    %eax, %ebx

between the two inline asms, and if I leave just the =r <- (r, r) alternative
and nothing else, LRA ICEs on it (on both variants).  All is with
-O2 -m32 -mbmi -mstv -msse2.

So, is there something in LRA that prevents these partial overlaps?
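
To make the double-overlap case concrete, here is a small standalone C model of it. This is only a sketch, not GCC code: the four-element array r[] standing for SI:N .. SI:N+3 and the helper names andn_si, split_in_place and split_with_scratch are made up for illustration. It assumes each DImode value is stored low:high in consecutive registers, so the first source is r0:r1, the second source is r2:r3 and the destination is r1:r2 (the same shape as x in ax:dx, y in cx:bx, z in dx:cx above). It shows that writing the destination directly goes wrong for either ordering of the two SImode halves, while computing one half into a scratch first does not.

```c
#include <assert.h>
#include <stdint.h>

/* One SImode half of the split: r[dst] = r[a] & ~r[b].  */
void
andn_si (uint32_t *r, int dst, int a, int b)
{
  r[dst] = r[a] & ~r[b];
}

/* Fill r[0..3] with test values: src1 = r0:r1, src2 = r2:r3 (low:high).  */
static void
init_regs (uint32_t *r)
{
  r[0] = 0x0000ffffu;
  r[1] = 0x00ff00ffu;
  r[2] = 0x0f0f0f0fu;
  r[3] = 0x33333333u;
}

/* Split DI r1:r2 = r0:r1 & ~r2:r3 writing the destination directly.
   Returns 1 if both halves of the result are correct, 0 otherwise.  */
int
split_in_place (int low_first)
{
  uint32_t r[4];
  init_regs (r);
  uint32_t want_lo = r[0] & ~r[2];
  uint32_t want_hi = r[1] & ~r[3];

  if (low_first)
    {
      andn_si (r, 1, 0, 2);	/* clobbers r1, the high half of src1 */
      andn_si (r, 2, 1, 3);
    }
  else
    {
      andn_si (r, 2, 1, 3);	/* clobbers r2, the low half of src2 */
      andn_si (r, 1, 0, 2);
    }
  return r[1] == want_lo && r[2] == want_hi;
}

/* Same split, but the low half goes through a scratch value first.  */
int
split_with_scratch (void)
{
  uint32_t r[4];
  init_regs (r);
  uint32_t want_lo = r[0] & ~r[2];
  uint32_t want_hi = r[1] & ~r[3];

  uint32_t tmp = r[0] & ~r[2];	/* low half, kept out of r1 for now */
  andn_si (r, 2, 1, 3);		/* high half; r1 is still intact */
  r[1] = tmp;
  return r[1] == want_lo && r[2] == want_hi;
}
```

Calling split_in_place (1) and split_in_place (0) both return 0, while split_with_scratch () returns 1, so in this model no reordering alone can rescue the split when the destination partially overlaps both sources.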