https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90622

            Bug ID: 90622
           Summary: Suboptimal code generated for
                    __builtin_avr_insert_bits
           Product: gcc
           Version: 5.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: igusarov at mail dot ru
  Target Milestone: ---

Please consider the following function:

    uint8_t copy_bit_5_to_bit_2(uint8_t dst, uint8_t src)
    {
        return __builtin_avr_insert_bits(0xFFFFF5FF, src, dst);
    }

That particular map value (magic hex constant) is supposed to copy the 5th bit
from argument 'src' to the 2nd bit of argument 'dst' while leaving all other
bits of dst unmodified.

In other words, given that
bit representation of src is [s7 s6 s5 s4 s3 s2 s1 s0], and
bit representation of dst is [d7 d6 d5 d4 d3 d2 d1 d0],
it should return [d7 d6 d5 d4 d3 s5 d1 d0].
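The nibble-by-nibble reading of the map value can be sketched in portable C.
This is only a reference model following the semantics given in the GCC
documentation for the builtin, not the compiler's implementation. The names
insert_bits_model and insert_bits_model_examples are hypothetical, and
returning 0 for the bits the documentation leaves undefined (nibbles 8..0xE)
is an arbitrary choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model of __builtin_avr_insert_bits(map, bits, val).
 * Nibble n of map selects the source of result bit n:
 *   0xF      -> keep bit n of val
 *   0..7     -> take that bit number from bits
 *   8..0xE   -> undefined per the documentation; this model yields 0 */
uint8_t insert_bits_model(uint32_t map, uint8_t bits, uint8_t val)
{
    uint8_t result = 0;
    for (int n = 0; n < 8; n++) {
        unsigned x = (map >> (4 * n)) & 0xFu;   /* n-th nibble of map */
        unsigned bit;
        if (x == 0xFu)
            bit = (val >> n) & 1u;              /* bit n of val unchanged */
        else if (x < 8u)
            bit = (bits >> x) & 1u;             /* bit x of bits */
        else
            bit = 0u;                           /* arbitrary: undefined case */
        result |= (uint8_t)(bit << n);
    }
    return result;
}

/* The two map values from this report, checked against the model. */
void insert_bits_model_examples(void)
{
    /* 0xFFFFF5FF: result bit 2 comes from bit 5 of src, the rest from dst. */
    assert(insert_bits_model(0xFFFFF5FFu, 0x20, 0x00) == 0x04);
    /* 0xFFFFF2FF: result bit 2 comes from bit 2 of src, the rest from dst. */
    assert(insert_bits_model(0xFFFFF2FFu, 0x04, 0x00) == 0x04);
}
```

With map 0xFFFFF5FF only nibble 2 differs from 0xF, so only bit 2 of the
result is taken from src.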

The code generated for this function is perfect:

    bst r22,5    # Take the 5-th bit of r22
    bld r24,2    # Put it as the 2-nd bit in r24

Similar code is generated for copying any n-th bit to any m-th bit, provided
that n and m are different. Thus far everything is great.

However, the code generated for copying n-th bit to n-th bit is surprisingly
suboptimal. A similar function

    uint8_t copy_bit_2_to_bit_2(uint8_t dst, uint8_t src)
    {
        return __builtin_avr_insert_bits(0xFFFFF2FF, src, dst);
    }

gives:

    eor r22,r24
    andi r22,lo8(4)
    eor r24,r22

which takes an extra word of program memory and an extra CPU cycle at runtime.
I wonder what's wrong with using the same bst/bld idiom which is successfully
used for n-to-m copy? I would expect that the following code is much better:

    bst r22,2
    bld r24,2

It would be great if the compiler could generate it.
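To illustrate that the two instruction sequences are interchangeable, here is
a minimal C sketch that models each of them on plain 8-bit values and
compares the results for every input pair. The function names are
hypothetical, and the T flag used by bst/bld is modelled as a local variable;
this is a model of the instruction semantics, not compiler output.

```c
#include <assert.h>
#include <stdint.h>

/* Models: eor r22,r24 ; andi r22,lo8(4) ; eor r24,r22
 * with dst in r24 and src in r22. */
uint8_t copy_bit2_masked_merge(uint8_t dst, uint8_t src)
{
    src ^= dst;     /* bits where src and dst differ */
    src &= 4;       /* keep only the difference in bit 2 */
    dst ^= src;     /* flip bit 2 of dst if it differed from src */
    return dst;
}

/* Models: bst r22,2 ; bld r24,2 */
uint8_t copy_bit2_bst_bld(uint8_t dst, uint8_t src)
{
    unsigned t = (src >> 2) & 1u;               /* bst: T = bit 2 of src */
    return (uint8_t)((dst & ~4u) | (t << 2));   /* bld: bit 2 of dst = T */
}

/* Returns 1 if both models agree for all 65536 (dst, src) pairs, else 0. */
int sequences_agree(void)
{
    for (unsigned d = 0; d < 256; d++)
        for (unsigned s = 0; s < 256; s++)
            if (copy_bit2_masked_merge((uint8_t)d, (uint8_t)s) !=
                copy_bit2_bst_bld((uint8_t)d, (uint8_t)s))
                return 0;
    return 1;
}
```

Both models return dst with bit 2 replaced by bit 2 of src, so the shorter
bst/bld pair computes the same result as the three-instruction sequence.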
