https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78687
            Bug ID: 78687
           Summary: inefficient code generated for eggs.variant
           Product: gcc
           Version: 6.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Created attachment 40254
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40254&action=edit
gcc-eggs-variant-missing-opt.cpp

I have a piece of code that makes heavy use of a library called Eggs.Variant, which implements a C++17-like variant class. Profiling this code revealed that the code generated for the most expensive function is quite inefficient. Here is the generated code (with the percentage of time spent in each instruction and my comments):

 Percent |      Source code & Disassembly of a.out for cycles:pp
----------------------------------------------------------------
         :      ref_proxy<qual_option, inplace_ref<qual_option> > f()
         :      {
    0.00 :        400710:       sub    $0x140,%rsp
    0.00 :        400717:       mov    %rdi,%rax
    1.54 :        40071a:       movq   $0x2,0x28(%rdi)
    0.00 :        400722:       movl   $0x0,0x128(%rsp)
   22.36 :        40072d:       mov    0x128(%rsp),%rdx
!!! reading of stack memory immediately after writing to it
    1.63 :        400735:       mov    %rdx,0xe8(%rsp)
    0.00 :        40073d:       movl   $0x0,0xe8(%rsp)
!!! writing 0 immediately after writing some other value to it
   22.74 :        400748:       mov    0xe8(%rsp),%rdx
    1.59 :        400750:       mov    %rdx,0xa8(%rsp)
    0.00 :        400758:       movl   $0x0,0xa8(%rsp)
   22.72 :        400763:       mov    0xa8(%rsp),%rdx
!!! writing 0 immediately after writing some other value to it, then reading from it
    2.16 :        40076b:       mov    %rdx,0x128(%rsp)
    0.00 :        400773:       mov    -0x78(%rsp),%rdx
    0.00 :        400778:       movl   $0x0,0x128(%rsp)
!!! writing some value to stack memory, then writing 0 to it
    0.01 :        400783:       mov    %rdx,(%rdi)
    1.66 :        400786:       mov    -0x70(%rsp),%rdx
    0.00 :        40078b:       mov    %rdx,0x8(%rdi)
    0.00 :        40078f:       mov    -0x68(%rsp),%rdx
    0.01 :        400794:       mov    %rdx,0x10(%rdi)
    1.72 :        400798:       mov    -0x60(%rsp),%rdx
    0.00 :        40079d:       mov    %rdx,0x18(%rdi)
    0.00 :        4007a1:       mov    -0x58(%rsp),%rdx
    0.02 :        4007a6:       mov    %rdx,0x20(%rdi)
   20.15 :        4007aa:       mov    0x128(%rsp),%rdx
!!! again reading stack memory where 0 was written several instructions ago
    1.68 :        4007b2:       mov    %rdx,0x30(%rdi)
    0.00 :        4007b6:       add    $0x140,%rsp
    0.00 :        4007bd:       retq

Initially I thought that there must be some aliasing issue. But the memory accessed is the memory of local variables, and the program doesn't use the volatile qualifier either. Aliasing also does not explain why the compiler wrote to the same memory location twice in a row. As you can see, this function does little more than copy data from one memory location to another.

It turned out that clang generates much better code for the function:

 Percent |      Source code & Disassembly of a.out for cycles:pp
----------------------------------------------------------------
   31.86 :        400670:       movq   $0x2,0x28(%rdi)
   32.27 :        400678:       movl   $0x0,0x30(%rdi)
    0.23 :        40067f:       mov    %rdi,%rax
   35.65 :        400682:       retq

Unfortunately I didn't manage to make a small snippet that reproduces the issue. The attached file is quite big (about 500 lines long), but I hope it still allows the issue to be reproduced. The command line I used is:

g++ -pthread -Wall -O2 -g -DNDEBUG -fvisibility=hidden -std=gnu++14 gcc-eggs-variant-missing-opt.cpp

The compiler version was 6.2.
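
For illustration only, here is a minimal sketch of the general shape of code being described: a variant-like object (raw storage plus a discriminator) that is returned by value through several layers. It is not a reduction of the attachment and has not been verified to trigger the same missed optimization. All names (tiny_variant, result, make, pass) are hypothetical, and the offsets in the comments assume a typical x86-64 ABI.

```cpp
#include <cstddef>

// Hypothetical stand-in for a variant: raw storage plus a discriminator.
// The storage is deliberately left uninitialized, as it would be when the
// active alternative does not occupy it.
struct tiny_variant {
    unsigned char storage[40];
    std::size_t which;          // offset 0x28 under the assumed ABI
};

// Hypothetical aggregate whose layout resembles the stores in the listing:
// a discriminator at offset 0x28 and a 32-bit field at offset 0x30.
struct result {
    tiny_variant v;
    int tail;                   // offset 0x30 under the assumed ABI
};

// Build the object: only two fields carry meaningful values.
inline result make() {
    result r;
    r.v.which = 2;
    r.tail = 0;
    return r;
}

// Pass the object through a by-value layer, the way nested wrapper types
// might. Each layer is a candidate for a temporary on the stack.
inline result pass(result r) { return r; }

// Ideally this compiles down to the two stores plus a return, as in the
// clang listing; whether a given compiler achieves that is not claimed here.
result f() { return pass(pass(pass(make()))); }
```

Comparing the output of `g++ -O2 -S` and `clang++ -O2 -S` on a sketch like this is one way to check whether intermediate stack copies survive optimization.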