https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112268

            Bug ID: 112268
           Summary: AVR-GCC generates suboptimal code for bit shifts
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: anton at tchekov dot net
  Target Milestone: ---

This function:

#include <stdint.h>

uint8_t extract1(uint32_t val)
{
    return val >> 26;
}

generates the following assembly code, which shifts all four bytes of the value
right by one bit per iteration, in a loop that runs 26 times.
The output is exactly the same at optimization levels -O2, -O3 and -Os. (The
exact shift amount is not important.)

extract1:
        mov r27,r25
        mov r26,r24
        mov r25,r23
        mov r24,r22
        ldi r18,26
        1:
        lsr r27
        ror r26
        ror r25
        ror r24
        dec r18
        brne 1b
        ret

It is possible to do a lot better with this workaround, which is exactly
equivalent (bits 26 to 31 all lie in the most significant byte) and needs only
3 instructions, not counting the ret:

uint8_t extract2(uint32_t val)
{
    uint8_t tmp = val >> 24;
    return tmp >> 2;
}

extract2:
        mov r24,r25
        lsr r24
        lsr r24
        ret
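
The equivalence claimed above can be checked on a host machine with a small
sketch like the following. The helper check_equivalence() is hypothetical and
not part of the original report; it relies on the observation that both
functions depend only on the top byte of val, so it tries every top byte with
the low 24 bits all clear and all set.

```c
#include <assert.h>
#include <stdint.h>

/* Original form from the report: a single 32-bit shift. */
uint8_t extract1(uint32_t val)
{
    return val >> 26;
}

/* Workaround from the report: take the top byte, then shift it. */
uint8_t extract2(uint32_t val)
{
    uint8_t tmp = val >> 24;
    return tmp >> 2;
}

/* Hypothetical helper (an assumption, not in the report): asserts that
   both forms agree for every possible top byte, once with the low
   24 bits clear and once with them all set. */
void check_equivalence(void)
{
    for (uint32_t top = 0; top < 256; top++) {
        uint32_t lo_clear = top << 24;
        uint32_t lo_set = lo_clear | 0x00FFFFFFu;
        assert(extract1(lo_clear) == extract2(lo_clear));
        assert(extract1(lo_set) == extract2(lo_set));
    }
}
```

Calling check_equivalence() from a test program should complete without
tripping an assertion. This only confirms that the two C functions compute the
same value; it says nothing about the code avr-gcc emits for them.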

The "shift loop" appears only with 32-bit integers. With 16-bit integers the
optimization opportunity is recognized:

uint8_t extract3(uint16_t val)
{
    return val >> 9;
}

extract3:
        mov r24,r25
        lsr r24
        ret
