2011/11/3 Georg-Johann Lay <a...@gjlay.de>:
> This is a tweak for the signed 16- and 32-bit division routines.
> The old code called the subroutine __divmod{si|hi}4_neg1 and returned if the
> T-flag was not set. This is costly. By shuffling the instructions, the test
> can be moved up without increasing the code size while saving calls here and
> there.
>
> The speed gain is 1..17 ticks for ATmega88, which is a speed-up of up to 7%
> for 16-bit division (formerly about 230-240 ticks). For 32-bit division the
> absolute speed gain is the same.
>
> Moreover, addqi3 can handle +/-2 now, which saves a reload if the constant
> is added to a non-d register. The new *negqihi2 insn is for code like
>
> int minus (char a)
> {
>     return -a;
> }
>
> that compiled to
>
> minus:
>     clr  r25            ; 6  extendqihi2/1  [length = 3]
>     sbrc r24,7
>     com  r25
>     com  r25            ; 7  neghi2/1  [length = 3]
>     neg  r24
>     sbci r25,lo8(-1)
>     ret                 ; 25 return  [length = 1]
>
> and now is compiled to a shorter, faster sequence that does not need a
> d-register:
>
> minus:
>     clr  r25            ; 7  *negqihi2  [length = 4]
>     neg  r24
>     brge .+2
>     com  r25
>     ret                 ; 25 return  [length = 1]
>
> Tested without regressions. Moreover, the new sequences are tested
> individually against the old code.
>
> The patch is against the old infrastructure but the changelog is already for
> the new libgcc layout.
>
> Ok for trunk?
>
> Johann
>
> gcc/
>         * config/avr/constraints.md (Cm2): New constraint for int -2.
>         * config/avr/avr.md (addqi3): Use it.  New alternatives for +/-2.
>         (*negqihi2): New insn.
> libgcc/
>         * config/avr/lib1funcs.S (__divmodhi4, __divmodsi4): Tweak speed.
>
Approved. Denis.
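
For readers following the new *negqihi2 sequence quoted above, here is a rough C model of its four instructions. It is only a sketch and not part of the patch: the function name negqihi2_model and the flag variables are made up for illustration, and it assumes the documented AVR behaviour that NEG sets V only when the result is 0x80 and that BRGE branches when S = N xor V is clear.

```c
#include <stdint.h>

/* Hypothetical model (not from the patch) of:
 *     clr  r25
 *     neg  r24
 *     brge .+2
 *     com  r25
 * The 16-bit result lives in r25:r24. */
int16_t negqihi2_model(int8_t a)
{
    uint8_t r24 = (uint8_t)a;
    uint8_t r25 = 0;                      /* clr r25 */

    r24 = (uint8_t)(0u - r24);            /* neg r24 */
    int n = (r24 & 0x80) != 0;            /* N flag: bit 7 of the result */
    int v = (r24 == 0x80);                /* V flag: overflow only for 0x80 */

    /* brge .+2 skips the next instruction when S = N ^ V is 0,
     * so com r25 only runs when the negated value is negative. */
    if (n ^ v)
        r25 = (uint8_t)~r25;              /* com r25 */

    return (int16_t)(uint16_t)(((unsigned)r25 << 8) | r24);
}
```

The case a = -128 shows why the sequence branches on S rather than on N alone: NEG leaves 0x80 in r24 with both N and V set, so S is clear, r25 stays 0, and the result is +128 as a 16-bit value.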