Richard Henderson <r...@twiddle.net> writes:

  On 04/23/12 07:49, Torbjorn Granlund wrote:
  > Do you know the repeat rate of umull, umlal, umaal, assuming no reg
  > dependencies?

  For a8: 3 cycles.

For a9 it seems to be 2 cycles, so 3.25 c/l for the current addmul_1 is
not very good.
I have found no timing docs, so I measured it myself:

        .text
        .global main
main:   push    {r4-r8}
        mov     r12, #0x3b800000        @ loop count
1:      subs    r12, r12, #1
        umaal   r0, r1, r14, r14        @ four umaal with no register
        umaal   r2, r3, r14, r14        @ dependencies between them
        umaal   r4, r5, r14, r14
        umaal   r6, r7, r14, r14
        bne     1b
        pop     {r4-r8}
        bx      lr

This loop takes about 9 cycles per iteration, or 2.25 cycles per umaal.
The latency is 3 cycles (found by using r0,r1 for every umaal above).

-- 
Torbjörn
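For anyone repeating the measurement, the arithmetic behind the per-umaal figure can be sketched as below. This is only an illustration: the helper name cycles_per_umaal, the 9.0 s wall time and the 1 GHz clock are assumptions made for the example, not values taken from the run above. Only the loop count and the four umaal per iteration come from the test program.

```python
# Loop count loaded into r12 by the test program (roughly 10**9 iterations).
ITERATIONS = 0x3B800000
# Number of umaal instructions in the loop body.
UMAAL_PER_ITERATION = 4


def cycles_per_umaal(elapsed_seconds, clock_hz):
    """Convert a wall-clock run time into cycles per umaal.

    Hypothetical helper: elapsed_seconds and clock_hz must be supplied by
    whoever runs the test; neither value is given in the message above.
    """
    cycles_per_iteration = elapsed_seconds * clock_hz / ITERATIONS
    return cycles_per_iteration / UMAAL_PER_ITERATION


if __name__ == "__main__":
    # Assumed example values: a 9.0 s run on a core clocked at 1 GHz.
    print(round(cycles_per_umaal(9.0, 1e9), 2))
```

Since the loop overhead (subs, bne) is included in the timed iteration, the result is an upper bound on the cost of the umaal instructions alone.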