https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82888

Bug ID: 82888
Summary: terrible code generation for initialization of POD array members vs. clang
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: froydnj at gcc dot gnu.org
CC: jseward at acm dot org, mh+gcc at glandium dot org
Target Milestone: ---

Consider the testcase:

class A {
public:
  A();
private:
  unsigned char mStorage[4096];
};

A::A()
  : mStorage()
{}

gcc -O2 generates a loop that looks like:

.L2:
        movb    $0, (%rdi)
        addq    $1, %rdi
        cmpq    %rdi, %rax
        jne     .L2

which is terribly slow.  (The original motivation for this bug report came
from an -Og compilation, where you get:

        movl    $4095, %eax
.L3:
        testq   %rax, %rax
        js      .L1
        movb    $0, (%rdi)
        addq    $1, %rdi
        subq    $1, %rax
        jmp     .L3

which is even worse.)

clang -O2 (or -O1), on the other hand, generates:

        xorl    %esi, %esi
        movl    $4096, %edx             # imm = 0x1000
        jmp     memset                  # TAILCALL

which is ideal, and opens up opportunities for the backend to lower the
memset to something intelligent (e.g. rep stos).