https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85116
Bug ID: 85116
Summary: std::min_element does not optimize well with inlined predicate
Product: gcc
Version: 7.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: christopher.schell at oculus dot com
Target Milestone: ---

According to godbolt (https://godbolt.org/g/igzsnL), the following code:

    #define SIZE 1000
    std::array<double, SIZE> testArray;

    int getMinIdxCPPStyle(double offset) {
        auto minElement = std::min_element(std::cbegin(testArray),
                                           std::cend(testArray),
                                           [offset](auto a, auto b) {
            return std::abs(a - offset) < std::abs(b - offset);
        });
        return std::distance(std::cbegin(testArray), minElement);
    }

generates the following under -O3:

    getMinIdxCPPStyle(double):
            movq    xmm3, QWORD PTR .LC1[rip]
            mov     eax, OFFSET FLAT:testArray
            mov     edx, OFFSET FLAT:testArray+8
    .L11:
            movsd   xmm1, QWORD PTR [rdx]
            movsd   xmm2, QWORD PTR [rax]
            subsd   xmm1, xmm0
            subsd   xmm2, xmm0
            andpd   xmm1, xmm3
            andpd   xmm2, xmm3
            ucomisd xmm2, xmm1
            cmova   rax, rdx
            add     rdx, 8
            cmp     rdx, OFFSET FLAT:testArray+8000
            jne     .L11
            sub     rax, OFFSET FLAT:testArray
            sar     rax, 3
            ret

The problem is that a typical C-style loop beats this easily, because it caches the current minimum value instead of fetching it and recomputing it on every iteration.

Is there a reason the generated code should not keep the minimum value in a register, rather than fetching it again (probably causing a cache miss) and then unnecessarily rerunning the computation on it?
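
The report does not show the C-style loop it compares against. For illustration only, a minimal sketch of what such a loop might look like is given below; the function name getMinIdxCStyle and the local variable names are assumptions, not taken from the report. The point is that the distance of the current best element is kept in a local (minDist), so each iteration loads and transforms only one array element.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

#define SIZE 1000
std::array<double, SIZE> testArray;

// Hypothetical C-style counterpart to getMinIdxCPPStyle (name assumed).
// Returns the index of the first element closest to `offset`.
int getMinIdxCStyle(double offset) {
    int minIdx = 0;
    // Cached distance of the current best element; a compiler can keep
    // this in a register instead of reloading testArray[minIdx].
    double minDist = std::abs(testArray[0] - offset);
    for (std::size_t i = 1; i < SIZE; ++i) {
        double dist = std::abs(testArray[i] - offset);
        if (dist < minDist) {  // strict '<' keeps the first minimum, like std::min_element
            minDist = dist;
            minIdx = static_cast<int>(i);
        }
    }
    return minIdx;
}
```

With the strict comparison, this sketch should select the same index as the std::min_element version for the same input, including ties, so the two functions can be compared directly in the generated assembly.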