https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109441
Bug ID: 109441
Summary: missed optimization when all elements of vector are known
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: hiraditya at msn dot com
Target Milestone: ---

Reference: https://godbolt.org/z/af4x6zhz9

When all elements of the vector are known to be 0, the compiler should be able to remove the loop and just return 0.

Testcase:

#include<vector>
using namespace std;
using T = int;
T v() {
  T s;
  std::vector<T> v;
  v.resize(1000, 0);
  for (auto i = 0; i < v.size(); ++i) {
    s += v[i];
  }
  return s;
}

$ g++ -O3 -std=c++17

.LC0:
        .string "vector::_M_fill_insert"
v():
        push rbx
        pxor xmm0, xmm0
        mov edx, 1000
        xor esi, esi
        sub rsp, 48
        lea rcx, [rsp+12]
        lea rdi, [rsp+16]
        mov QWORD PTR [rsp+32], 0
        mov DWORD PTR [rsp+12], 0
        movaps XMMWORD PTR [rsp+16], xmm0
        call std::vector<int, std::allocator<int> >::_M_fill_insert(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, unsigned long, int const&)
        mov rdx, QWORD PTR [rsp+24]
        mov rdi, QWORD PTR [rsp+16]
        mov rax, rdx
        sub rax, rdi
        mov rsi, rax
        sar rsi, 2
        cmp rdx, rdi
        je .L99
        test rax, rax
        mov ecx, 1
        cmovne rcx, rsi
        cmp rax, 12
        jbe .L107
        mov rdx, rcx
        pxor xmm0, xmm0
        mov rax, rdi
        shr rdx, 2
        sal rdx, 4
        add rdx, rdi
.L101:
        movdqu xmm2, XMMWORD PTR [rax]
        add rax, 16
        paddd xmm0, xmm2
        cmp rdx, rax
        jne .L101
        movdqa xmm1, xmm0
        psrldq xmm1, 8
        paddd xmm0, xmm1
        movdqa xmm1, xmm0
        psrldq xmm1, 4
        paddd xmm0, xmm1
        movd ebx, xmm0
        test cl, 3
        je .L99
        and rcx, -4
        mov eax, ecx
.L100:
        lea edx, [rax+1]
        add ebx, DWORD PTR [rdi+rcx*4]
        movsx rdx, edx
        cmp rdx, rsi
        jnb .L99
        add eax, 2
        lea rcx, [0+rdx*4]
        add ebx, DWORD PTR [rdi+rdx*4]
        cdqe
        cmp rax, rsi
        jnb .L99
        add ebx, DWORD PTR [rdi+4+rcx]
.L99:
        test rdi, rdi
        je .L98
        mov rsi, QWORD PTR [rsp+32]
        sub rsi, rdi
        call operator delete(void*, unsigned long)
.L98:
        add rsp, 48
        mov eax, ebx
        pop rbx
        ret
.L107:
        xor eax, eax
        xor ecx, ecx
        jmp .L100
        mov rbx, rax
        jmp .L105
v() [clone .cold]:
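
For comparison, the transformation requested in this report can be sketched as a pair of functions: the loop as written, and the constant result it could fold to. This is only an illustration under stated assumptions, not what GCC currently emits. The accumulator is assumed to be zero-initialized (the testcase above declares `T s;` without an initializer), and the names sum_zero_vector, sum_zero_vector_folded and check_equivalence are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using T = int;

// Variant of the report's testcase.
// Assumption: `s` starts at 0 here; the original leaves it uninitialized.
T sum_zero_vector() {
    T s = 0;
    std::vector<T> v;
    v.resize(1000, 0);  // 1000 elements, every one equal to 0
    for (std::size_t i = 0; i < v.size(); ++i) {
        s += v[i];      // adds 0 on every iteration
    }
    return s;
}

// Hand-folded form the report asks the optimizer to reach:
// no allocation, no loop, just the constant.
T sum_zero_vector_folded() {
    return 0;
}

// Hypothetical helper: confirms both forms agree at run time.
void check_equivalence() {
    assert(sum_zero_vector() == sum_zero_vector_folded());
}
```

Comparing the -O3 output of the two functions (for example on godbolt.org) shows how far the generated code for the loop is from the folded form.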