We ran bench++ to look for C++ samples that run slower at -O3 with
gcc-[34].x than with gcc-2.95.  We're attaching one such case,
minimized as far as we could (so it might not be testing the same
thing as the original code).  It consists of a simple function that
accesses bitfields, called in a loop from main.
gcc-3.4.3, gcc-4.0.0 and gcc-4.1-20050627 all produce binaries that
seem to be ten times slower on this than those produced by gcc-2.95.3.
All the compilers inlined the function, which is fine.
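
For readers without the attachment, here is a rough sketch of the kind of
code being described.  It is reconstructed from the assembly listings, not
copied from the attached source: the struct name, field names, field widths
(3, 1 and 4 bits) and the helper names set_and_check and run_loop are all
assumptions, and bitfield layout is implementation-defined.

```cpp
#include <cstdlib>

// Hypothetical record; widths are guessed from the masks 7, 8 and 240
// that appear in the generated code.
struct BitRec {
    unsigned a : 3;
    unsigned b : 1;
    unsigned c : 4;
};

BitRec b_rec;

// Store three bitfields, then read two of them back and check them.
inline void set_and_check() {
    b_rec.a = 6;
    b_rec.b = 0;
    b_rec.c = 5;
    if (b_rec.a != 6) std::abort();
    if (b_rec.c != 5) std::abort();
}

// The loop that main would run; the iteration count is up to the caller.
void run_loop(int iterations) {
    for (int i = 0; i < iterations; ++i)
        set_and_check();
}
```

Calling run_loop with a large count from main and building at -O3 should
give a loop of the same general shape as the listings, though the exact
output depends on the compiler version and target.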

Here's the loop as compiled by the older compiler, gcc-2.95.3:

.L12:
        movb $86,%dl
        movb %dl,b_rec
        movb %dl,%al
        andb $7,%al
        cmpb $6,%al
        je .L14
        call abort
        .align 4
.L14:
        andb $240,%dl
        cmpb $80,%dl
        je .L11
        call abort
        .align 4
.L11:
        decl %ecx
        testl %ecx,%ecx
        jg .L12

And here's the code from gcc-4.1-20050625:

        jmp     .L16
        .p2align 4,,7
.L27:
        andb    $-16, %dl
        cmpb    $80, %dl
        jne     .L25
        decl    %ebx
        je      .L26
.L16:
        movl    %ecx, %eax
        andl    $-8, %eax
        orl     $6, %eax
        movl    %eax, b_rec
        andb    $-9, b_rec
        movl    b_rec, %eax
        andl    $-241, %eax
        orl     $80, %eax
        movl    %eax, b_rec
        movl    %eax, %ecx
        movzbl  b_rec, %edx
        movb    %dl, %al
        andb    $7, %al
        cmpb    $6, %al
        je      .L27
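
As a reading aid for the two listings, the constants can be checked with
plain mask arithmetic.  The sketch below assumes the three fields occupy
bits 0-2, bit 3 and bits 4-7 of the first byte of b_rec; the function names
are made up for illustration.  pack_fields builds the byte in one step,
which is what the single "movb $86" amounts to, and pack_stepwise mirrors
the sequence of and/or operations in the newer listing.

```cpp
#include <cstdint>

// One-shot packing: low 3 bits, then 1 bit, then high 4 bits.
inline std::uint8_t pack_fields(unsigned low3, unsigned bit3, unsigned high4) {
    return static_cast<std::uint8_t>(
        (low3 & 7u) | ((bit3 & 1u) << 3) | ((high4 & 15u) << 4));
}

// Field-by-field read-modify-write, as in the newer listing.
inline std::uint8_t pack_stepwise(std::uint8_t old_value) {
    unsigned v = old_value;
    v = (v & ~7u) | 6u;     // andl $-8,   orl $6
    v = v & ~8u;            // andb $-9
    v = (v & ~240u) | 80u;  // andl $-241, orl $80
    return static_cast<std::uint8_t>(v);
}
```

Both routes give the same byte whatever the previous contents were, so the
three stores could in principle be merged into a single constant store.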

We'll attach the preprocessed source.

-- 
           Summary: performance regression for gcc newer than 2.95
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: danalis at cis dot udel dot edu
                CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: i686-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22563
