Instead of jumping into the loop at the entry point that ROLs r_arg1
(with C=0), the equivalent LSL r_arg1 can be performed before the loop.
This reduces the number of loop iterations from 9 to 8 and drops the RJMP.
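
To illustrate why the two variants are equivalent, here is a rough C
model of both loops with an explicit carry flag.  It is only a sketch:
the names rol8, compare_subtract, model_old and model_new are made up
for illustration, and the part of the loop body that is not visible in
the hunk below (conditional subtract, ROL r_arg1, DEC/BRNE, final COM)
is assumed to be the usual restoring-division pattern.

```c
#include <stdint.h>

typedef struct { uint8_t quot, rem; } divmod8;

/* ROL: rotate left through carry (carry goes into bit 0, bit 7 comes out). */
static uint8_t rol8(uint8_t v, unsigned *c)
{
    unsigned out = (v >> 7) & 1u;
    v = (uint8_t)((v << 1) | (*c & 1u));
    *c = out;
    return v;
}

/* Assumed loop body: ROL r_rem, CP, BRCS, SUB.
   Leaves C = 1 if nothing was subtracted, i.e. the complemented quotient bit. */
static void compare_subtract(uint8_t *rem, uint8_t divisor, unsigned *c)
{
    *rem = rol8(*rem, c);
    if (*rem < divisor) {
        *c = 1;
    } else {
        *rem = (uint8_t)(*rem - divisor);
        *c = 0;
    }
}

/* Before the patch: 9 passes, the first one enters at the ROL r_arg1. */
static divmod8 model_old(uint8_t arg1, uint8_t arg2)
{
    uint8_t rem = 0;
    unsigned c = 0;                     /* SUB r_rem,r_rem clears C */
    for (int cnt = 9; cnt != 0; --cnt) {
        if (cnt != 9)                   /* first pass skips the body (RJMP) */
            compare_subtract(&rem, arg2, &c);
        arg1 = rol8(arg1, &c);
    }
    return (divmod8){ (uint8_t)~arg1, rem };   /* COM r_arg1 */
}

/* After the patch: LSL r_arg1 up front, then 8 full passes. */
static divmod8 model_new(uint8_t arg1, uint8_t arg2)
{
    uint8_t rem = 0;
    unsigned c = 0;
    arg1 = rol8(arg1, &c);              /* LSL behaves like ROL with C=0 */
    for (int cnt = 8; cnt != 0; --cnt) {
        compare_subtract(&rem, arg2, &c);
        arg1 = rol8(arg1, &c);
    }
    return (divmod8){ (uint8_t)~arg1, rem };   /* COM r_arg1 */
}
```

In this model the first pass of the old loop does nothing except shift
r_arg1 left with a zero carry-in, which is exactly what the LSL does.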

Applied as obvious.

Johann

AVR: target/114794 - Tweak __udivmodqi4

libgcc/
        PR target/114794
        * config/avr/lib1funcs.S (__udivmodqi4): Tweak.

diff --git a/libgcc/config/avr/lib1funcs.S b/libgcc/config/avr/lib1funcs.S
index 535510ab867..af4d7d97016 100644
--- a/libgcc/config/avr/lib1funcs.S
+++ b/libgcc/config/avr/lib1funcs.S
@@ -1339,9 +1339,9 @@ DEFUN __umulsidi3

 #if defined (L_udivmodqi4)
 DEFUN __udivmodqi4
-       sub     r_rem,r_rem     ; clear remainder and carry
-       ldi     r_cnt,9         ; init loop counter
-       rjmp    __udivmodqi4_ep ; jump to entry point
+       clr     r_rem           ; clear remainder
+       ldi     r_cnt,8         ; init loop counter
+       lsl     r_arg1          ; shift dividend
 __udivmodqi4_loop:
        rol     r_rem           ; shift dividend into remainder
        cp      r_rem,r_arg2    ; compare remainder & divisor
