https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

            Bug ID: 95798
           Summary: Initialization code --- suboptimal
           Product: gcc
           Version: 9.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zero at smallinteger dot com
  Target Milestone: ---

Created attachment 48764
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48764&action=edit
sample code

This is similar to (but not the same as) bug 87223, which concerns structs.
This report also covers a further problem with the same code under gcc 10.x.
The issue was first noticed with gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, compiling
with -O3.

First, note the initialization code that trivially sets values to zero in an
array,

        mov     eax, edi
        sub     rsp, 8080
        xor     edx, edx
        and     eax, 127
        mov     QWORD PTR [rsp-120+rax*8], 0
        mov     QWORD PTR [rsp-112+rax*8], 0
        mov     QWORD PTR [rsp-104+rax*8], 0
        mov     QWORD PTR [rsp-96+rax*8], 0
        mov     QWORD PTR [rsp-88+rax*8], 0
        mov     QWORD PTR [rsp-80+rax*8], 0
        mov     QWORD PTR [rsp-72+rax*8], 0
        mov     QWORD PTR [rsp-64+rax*8], 0
        xor     eax, eax

This would be better if a register were first set to zero and that register
were then stored to each location, since every store of an immediate zero
carries a 4-byte immediate field that a register store does not need.
Further, note that a zeroed register (edx) is already available but is not
used.  This is similar to bug 87223 for structs; here the issue manifests for
arrays.
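
For readers without access to attachment 48764, the pattern can be sketched in
C roughly as follows.  This is a hypothetical reproducer inferred from the
listing (a mask of 127 on the argument, eight 64-bit zero stores, and a stack
frame of roughly 8 KiB).  It is not the reporter's sample code, the names
zero_run, TABLE_SIZE and RUN_LENGTH are invented for illustration, and it may
not produce exactly the same assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes: 1024 * 8 bytes is close to the ~8 KiB frame in the listing. */
#define TABLE_SIZE 1024
/* Eight consecutive zero stores, as in the listing. */
#define RUN_LENGTH 8

/* Hypothetical reproducer; NOT the code from attachment 48764. */
uint64_t zero_run(unsigned int seed)
{
    uint64_t table[TABLE_SIZE];
    int start = (int)(seed & 127);      /* corresponds to "and eax, 127" */

    /* Give every element a nonzero value so the zero stores are not dead. */
    for (size_t i = 0; i < TABLE_SIZE; i++)
        table[i] = i + 1;

    /* The stores of interest: zero RUN_LENGTH elements starting at start. */
    for (int k = 0; k < RUN_LENGTH; k++)
        table[start + k] = 0;

    /* Consume the array so the compiler has to keep the stores. */
    uint64_t sum = 0;
    for (size_t i = 0; i < TABLE_SIZE; i++)
        sum += table[i];
    return sum;
}
```

Compiling a file like this with -O3 -S -masm=intel and looking at the stores
emitted for the second loop is one way to compare compiler versions.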

Second, compiling the same code with gcc 10 versions at -O3 on godbolt.org
produces this code:

        mov     eax, edi
        mov     edx, edi
        sub     rsp, 8072
        and     eax, 127
        and     edx, 127
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+1]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+2]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+3]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+4]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+5]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+6]
        add     eax, 7
        movsx   rdx, edx
        cdqe
        mov     QWORD PTR [rsp-120+rdx*8], 0
        xor     edx, edx
        mov     QWORD PTR [rsp-120+rax*8], 0
        xor     eax, eax

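This is considerably more verbose than the gcc 9.3 output, for no apparent
gain: each of the seven later indices is recomputed (lea or add) and
sign-extended (movsx or cdqe) before its store, instead of being folded into
the address displacement as gcc 9.3 does.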
