http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53227

             Bug #: 53227
           Summary: [4.8 Regression] FAIL: gcc.target/i386/movbe-2.c
                    scan-assembler-times movbe[ \t] 4
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Keywords: ra
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ubiz...@gmail.com
                CC: ber...@gcc.gnu.org, uweig...@gcc.gnu.org,
                    vmaka...@gcc.gnu.org
            Target: i686


Split from PR 53176, which changed lower-subreg to no longer split subregs
early on x86.

The following testcase

--cut here--
extern long long x;

void
foo (long long i)
{
  x = __builtin_bswap64 (i);
}

long long
bar ()
{
  return __builtin_bswap64 (x);
}
--cut here--

compiled with -O2 -mmovbe -m32 on an x86 target causes RA to allocate
non-optimal registers for "foo" (forcing a reload), while it is able to
allocate optimal registers for the "bar" case:

bar:
        movbe   x+4, %eax
        movbe   x, %edx
        ret
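
For reference, the two-instruction form above works because a 64-bit byte
swap on a 32-bit target reduces to two 32-bit byte swaps with the halves
exchanged: the swapped high word (x+4) becomes the low result word (%eax),
and the swapped low word (x) becomes the high result word (%edx). A minimal
sketch of that decomposition follows; bswap64_by_halves is a hypothetical
helper written for illustration and is not part of the testcase or of GCC.

```c
#include <stdint.h>

/* Hypothetical helper (illustration only): byte-swap a 64-bit value as a
   32-bit target must, using two 32-bit swaps and exchanging the halves.  */
static uint64_t
bswap64_by_halves (uint64_t v)
{
  uint32_t lo = (uint32_t) v;          /* word at offset 0 (little-endian) */
  uint32_t hi = (uint32_t) (v >> 32);  /* word at offset 4 */

  /* The swapped low word becomes the high half of the result, and the
     swapped high word becomes the low half.  */
  return ((uint64_t) __builtin_bswap32 (lo) << 32)
         | (uint64_t) __builtin_bswap32 (hi);
}
```

Each of the two 32-bit swaps can be folded into a single movbe load or
store, which is why four movbe instructions in total are expected for
"foo" and "bar" together.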

The situation with foo:

foo:
        pushl   %ebx
        movl    8(%esp), %eax
        movl    12(%esp), %edx
        movl    %eax, %ebx
        movl    %edx, %ecx
        bswap   %ebx
        bswap   %ecx
        movl    %ebx, x+4
        movl    %ecx, x
        popl    %ebx
        ret

This is a noticeable regression from 4.7, which generates:

foo:
        movbe   4(%esp), %eax
        movbe   8(%esp), %edx
        movl    %eax, x+4
        movl    %edx, x
        ret

Adding -mregparm=2 does not improve things:

foo:
        pushl   %ebx
        movl    %edx, %ecx
        movl    %eax, %ebx
        bswap   %ecx
        bswap   %ebx
        movl    %ecx, x
        movl    %ebx, x+4
        popl    %ebx
        ret

while 4.7 generates:

foo:
        movbe   %edx, x
        movbe   %eax, x+4
        ret
