Re: [avr-libc-dev] [bug #12739] Gcc assumes that target libc provides ffs function

David Brown Thu, 21 Apr 2005 01:48:56 -0700

> As Dmitry K. wrote:
>
> > I have look 'ffs' from newlib (1.12.0).  It take up to 800 (!)
> > clocks.  Reason: shift is included to loop.
>
> The slightly modified BSD code (counter reduced to 8 bits) yields
>
>         sbiw r24,0
>         breq .L1
>         ldi r18,lo8(1)
> .L9:
>         sbrc r24,0
>         rjmp .L8
>         lsr r25
>         ror r24
>         subi r18,lo8(-(1))
>         rjmp .L9
> .L8:
>         mov r24,r18
>         clr r25
> .L1:
>         ret
>
> This makes 6 clocks per cycle, so up to ~ 100 clocks max.
>
> I'm interested in seeing Dmitry's code...
> --
> cheers, J"org               .-.-.   --... ...--   -.. .  DL8DTL
>


Just for fun, I tried implementing ffs in pure assembler.  I haven't tested
this code, so I might have made a silly mistake somewhere, or counted cycles
incorrectly.  However, it has a couple of useful features: because it only
looks at either the upper or the lower byte (as appropriate), the inner loop
is 4 cycles instead of 6, and the maximum number of iterations is 8.  It uses
only r24 and r25, with no temporary registers, which might be an advantage if
the function were to be inlined.  It's not quite as good as Dmitry's, at 13
words and a maximum of 43 cycles:

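.global ffs
ffs:
 subi r24, 0 ; Test lo byte (sets Z if it is zero)
 breq .L1  ; Lo byte 0
 mov r25, r24 ; Scan the lo byte in r25
 clr r24  ; Start with offset of 0
 rjmp .L2
.L1:
 subi r25, 0 ; Test hi byte
 breq .L3  ; Whole word is 0, so return 0
 ldi r24, 8 ; Start with offset of 8
.L2:
 inc r24  ; Count this bit position
 asr r25  ; Shift lowest bit into carry
 brcc .L2  ; Loop until a set bit falls out
 clr r25  ; Result fits in r24, so clear hi byte
.L3:
 ret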
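
Since the routine is untested, one cheap way to check the logic on a host
machine is a C model that mirrors the register steps one for one.  What
follows is only a sketch: the names ffs_reference, ffs_model and
ffs_model_check_all are made up for illustration and are not part of
avr-libc or newlib, and the model says nothing about how avr-as would
assemble the real thing.

```c
#include <assert.h>
#include <stdint.h>

/* Straightforward reference: 1-based index of the lowest set bit, 0 if none. */
int ffs_reference(uint16_t x)
{
    for (int n = 1; n <= 16; n++) {
        if (x & 1u)
            return n;
        x >>= 1;
    }
    return 0;
}

/* Hypothetical host-side model of the assembler routine.
   r24 and r25 stand in for the AVR registers of the same name. */
int ffs_model(uint16_t x)
{
    uint8_t r24 = (uint8_t)(x & 0xFFu);   /* lo byte of the argument */
    uint8_t r25 = (uint8_t)(x >> 8);      /* hi byte of the argument */
    uint8_t carry;

    if (r24 != 0) {          /* subi r24,0 / breq not taken */
        r25 = r24;           /* mov r25,r24 */
        r24 = 0;             /* clr r24 */
    } else {
        if (r25 == 0)        /* subi r25,0 / breq .L3 */
            return 0;        /* r25:r24 is already 0 */
        r24 = 8;             /* ldi r24,8 */
    }
    do {
        r24++;                                        /* inc r24 */
        carry = r25 & 1u;                             /* bit 0 goes to carry */
        r25 = (uint8_t)((r25 >> 1) | (r25 & 0x80u));  /* asr keeps bit 7 */
    } while (!carry);                                 /* brcc .L2 */
    return r24;              /* clr r25 leaves the result in r24 */
}

/* Compare the model against the reference for every 16-bit input. */
void ffs_model_check_all(void)
{
    for (uint32_t x = 0; x <= 0xFFFFu; x++)
        assert(ffs_model((uint16_t)x) == ffs_reference((uint16_t)x));
}
```

Calling ffs_model_check_all() from a small test program walks all 65536
inputs.  The model keeps asr rather than lsr on purpose: the sign bit that
asr copies in never reaches bit 0 before the first genuine set bit does.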

Words = 13
Cycles (for a result of "n"):
    0         6 + 5
    1..8      6 + n*4 + 5
    9..16     6 + (n-8)*4 + 5

Max = 43
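
As a cross-check of the cycle table, here is a small C sketch that tallies
cycles along the path the routine takes for a given argument.  The function
name ffs_cycles is hypothetical, and the timings are assumptions taken from
the usual AVR figures: 1 cycle for subi/mov/clr/ldi/inc/asr, 2 for rjmp,
2 for a taken branch and 1 for a branch that falls through, and 4 for ret on
a device with a 16-bit program counter.  The call itself is not counted.

```c
#include <stdint.h>

/* Hypothetical cycle tally for the assembler routine, excluding the call.
   Timings are the assumptions listed in the text above. */
int ffs_cycles(uint16_t x)
{
    uint8_t lo = (uint8_t)(x & 0xFFu);
    uint8_t hi = (uint8_t)(x >> 8);
    uint8_t r25, carry;
    int cycles = 1;                     /* subi r24,0 */

    if (lo != 0) {
        cycles += 1 + 1 + 1 + 2;        /* breq (not taken), mov, clr, rjmp */
        r25 = lo;
    } else {
        cycles += 2 + 1;                /* breq (taken), subi r25,0 */
        if (hi == 0)
            return cycles + 2 + 4;      /* breq (taken), ret */
        cycles += 1 + 1;                /* breq (not taken), ldi */
        r25 = hi;
    }
    for (;;) {
        cycles += 1 + 1;                /* inc, asr */
        carry = r25 & 1u;
        r25 = (uint8_t)((r25 >> 1) | (r25 & 0x80u));
        if (carry) {
            cycles += 1;                /* brcc falls through */
            break;
        }
        cycles += 2;                    /* brcc taken */
    }
    return cycles + 1 + 4;              /* clr r25, ret */
}
```

Under those assumptions the tally comes out one cycle lower than the table
in each row, because the last brcc falls through and the zero case skips the
final clr: 10 cycles for a result of 0, and 42 rather than 43 for the worst
cases (results 8 and 16).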


Best regards,

David




_______________________________________________
AVR-libc-dev mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/avr-libc-dev