https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82447
Bug ID: 82447
Summary: Consider removing cmp instruction while iterating on an array of known bound
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: antoshkka at gmail dot com
Target Milestone: ---

The following code

    unsigned loop_read(unsigned char* a) {
        const unsigned size = 128;
        unsigned sum = 0;
        for (unsigned i = 0; i < size; ++i) {
            sum += a[i];
        }
        return sum;
    }

generates this assembly:

    loop_read(unsigned char*):
            lea     rcx, [rdi+128]
            xor     eax, eax
    .L2:
            movzx   edx, BYTE PTR [rdi]
            add     rdi, 1
            add     eax, edx
            cmp     rdi, rcx        <=== This could be avoided
            jne     .L2
            rep ret

The trick is to iterate from -128 to 0 and place the "jne" right after the
increment: the add already sets the zero flag when the counter reaches 0, so
no separate cmp is needed. Here's how Clang does that:

    loop_read(unsigned char*):              # @loop_read(unsigned char*)
            xor     eax, eax
            mov     rcx, -128
    .LBB0_1:                                # =>This Inner Loop Header: Depth=1
            movzx   edx, byte ptr [rdi + rcx + 128]
            add     eax, edx
            add     rcx, 1
            jne     .LBB0_1
            ret
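
For illustration, the transformation requested above can be written by hand at the source level. This is only a sketch: the name loop_read_neg and the use of ptrdiff_t are assumptions made for this example and are not part of the report, and nothing here guarantees that a given compiler version will emit the cmp-free loop for it.

```c
#include <stddef.h>

/* Original form from the report: the index counts up from 0 to 128,
 * so the loop exit needs a comparison against the bound. */
unsigned loop_read(unsigned char* a) {
    const unsigned size = 128;
    unsigned sum = 0;
    for (unsigned i = 0; i < size; ++i) {
        sum += a[i];
    }
    return sum;
}

/* Hypothetical hand-written variant: the index counts from -128 up to 0.
 * The exit test is "index != 0", which on x86 can reuse the zero flag
 * set by the increment instead of a separate cmp. */
unsigned loop_read_neg(unsigned char* a) {
    const ptrdiff_t size = 128;
    unsigned char* end = a + size;          /* one past the last element */
    unsigned sum = 0;
    for (ptrdiff_t i = -size; i != 0; ++i) {
        sum += end[i];                      /* same element as a[i + 128] */
    }
    return sum;
}
```

Both functions read the same 128 bytes in the same order and return the same sum; only the shape of the induction variable differs.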