Richard Henderson <r...@twiddle.net> writes:

  On 04/23/12 07:49, Torbjorn Granlund wrote:
  > Do you know the repeat rate of umull, umlal, umaal, assuming no reg
  > dependencies?
  
  For a8: 3 cycles.
  
For a9 it seems to be 2 cycles, so 3.25 c/l (cycles per limb) for the
current addmul_1 is not very good.

I have found no timing docs, so I measured it myself:

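        .text
        .global main
main:   push    {r4-r8}                 @ save callee-saved regs (r4-r7 are clobbered below)
        mov     r12, #0x3b800000        @ loop count, roughly 10^9

1:      subs    r12, r12, #1            @ count down; sets the flags tested by bne
        umaal   r0, r1, r14, r14        @ four umaal with independent accumulator pairs,
        umaal   r2, r3, r14, r14        @ so no umaal depends on another's result
        umaal   r4, r5, r14, r14        @ within an iteration; r14 (lr) is only read
        umaal   r6, r7, r14, r14        @ and its value does not matter
        bne     1b

        pop     {r4-r8}
        bx      lr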

Each iteration of this loop takes about 9 cycles, or 2.25 cycles per umaal.
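
The arithmetic behind those figures can be sketched in a few lines of
Python. Only the loop count (taken from the mov instruction above) and the
four umaal per iteration come from the listing. How the cycle count was
actually obtained is not stated here, so the wall-clock conversion, the
function names and the clock_hz parameter are assumptions made for
illustration.

```python
# Loop count loaded into r12 by `mov r12, #0x3b800000` in the listing.
ITERATIONS = 0x3B800000
# Number of umaal instructions in the loop body.
UMAAL_PER_ITERATION = 4


def cycles_per_iteration(seconds, clock_hz, iterations=ITERATIONS):
    """Convert a wall-clock time for the whole loop into cycles per iteration.

    Assumption: the run was timed externally and the core clock is known.
    """
    return seconds * clock_hz / iterations


def cycles_per_umaal(loop_cycles, umaal_count=UMAAL_PER_ITERATION):
    """Spread the per-iteration cycle count over the umaal in the loop body."""
    return loop_cycles / umaal_count


if __name__ == "__main__":
    print(ITERATIONS)            # 998244352
    print(cycles_per_umaal(9))   # 2.25
```

Note that the subs and bne overhead is folded into the per-umaal figure by
this division, so 2.25 is an upper bound on the repeat rate.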

The latency is 3 cycles (found by using r0,r1 as the accumulator pair for
every umaal above, so that each umaal depends on the previous one).

-- 
Torbjörn
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
http://gmplib.org/mailman/listinfo/gmp-devel