It seems that unrolling your block2 by 2 could, in theory, be made
optimal. The loop control needs 2 slots, and there are 14 slots in
your block2.

2*14 + 2 = 30 slots.

At 3 slots per cycle that is 10 cycles for 4 limbs, which would give
10/4 = 2.5 c/l.
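
As a sanity check on that count, here is a minimal Python sketch of the
slot budget. It assumes 3 slots (macro-ops) per cycle, which is only
inferred from block2's 14 slots plus one spare fitting into 5 cycles.
The function and constant names are made up for illustration.

```python
from fractions import Fraction

# Figures taken from the thread.
BLOCK2_SLOTS = 14      # slots (macro-ops) in one block2
LOOP_SLOTS = 2         # add + jnz
LIMBS_PER_BLOCK2 = 2   # one block2 handles two limbs

# Assumption: inferred from 14 slots + 1 spare slot running in 5 cycles.
SLOTS_PER_CYCLE = 3


def ideal_cycles_per_limb(blocks):
    """Best-case c/l if every slot in every cycle is filled."""
    slots = blocks * BLOCK2_SLOTS + LOOP_SLOTS
    cycles = Fraction(slots, SLOTS_PER_CYCLE)
    limbs = blocks * LIMBS_PER_BLOCK2
    return cycles / limbs


if __name__ == "__main__":
    # Two copies of block2: 30 slots, 10 cycles, 4 limbs.
    print(float(ideal_cycles_per_limb(2)))  # 2.5
```

This only counts slots. It says nothing about whether the scheduler can
actually fill them, which is the real question.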

By the way, you suggest that perhaps moving the loop control up might
help. If the processor has out-of-order capability, why would this
help? Is there something else that prevents that from executing
earlier regardless?

Bill.

2008/11/23  <[EMAIL PROTECTED]>:
>
> On Sunday 23 November 2008 18:53:46 Bill Hart wrote:
>> That's very impressive!
>>
>> What do you mean by a slot?
>>
>
> Whatever I want it to mean!! Seeing how vague most asm docs are.
>
> A macro-op.
>
>
>> I presume by ax you mean rax, etc.
>>
>
> yeah, just lazy
>
>> There's also going to be some loop overhead right?
>
> included already
>
> Just been thinking about the timings I got from the different unrolling.
>
> mov $0,%r9
> mul %rcx
> add %rax,%r8
> mov 8(%rsi,%rbx,8),%rax
> adc %rdx,%r9
> mov %r8,(%rdi,%rbx,8)
> mul %rcx
> mov $0,%r10
> add %rax,%r9
> mov 16(%rsi,%rbx,8),%rax
> adc %rdx,%r10
> mov %r9,8(%rdi,%rbx,8)
>
> Above is a "basic block2" for two limbs. It is just two of the earlier
> "basic block" stuck together, but with the loads shifted up.
>
> If we assume this runs in 5 cycles, and the loop control takes an extra 1
> cycle, then
> unroll by 2 is (1*5+1)/2=3 c/l
> unroll by 4 is (2*5+1)/4=2.75 c/l
> unroll by 8 is (4*5+1)/8=2.625 c/l
> unroll by 16 is (8*5+1)/16=2.5625 c/l
>
> which matches my timings, plus a little bit
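
The model quoted above can be tabulated with a short Python sketch. It
assumes, as the quoted text does, 5 cycles per block2 (two limbs) plus 1
cycle of loop control per iteration. The names are hypothetical.

```python
from fractions import Fraction

BLOCK2_CYCLES = 5  # assumed in the thread: one block2 (two limbs) per 5 cycles
LOOP_CYCLES = 1    # assumed in the thread: extra cycle for add + jnz


def modelled_cycles_per_limb(unroll):
    """c/l for a loop body handling `unroll` limbs (unroll must be even)."""
    if unroll <= 0 or unroll % 2 != 0:
        raise ValueError("unroll must be a positive even number of limbs")
    blocks = unroll // 2
    return Fraction(blocks * BLOCK2_CYCLES + LOOP_CYCLES, unroll)


if __name__ == "__main__":
    for unroll in (2, 4, 8, 16):
        print(unroll, float(modelled_cycles_per_limb(unroll)))
    # 2 3.0
    # 4 2.75
    # 8 2.625
    # 16 2.5625
```

Under this model the result is 2.5 + 1/unroll, so it approaches 2.5 c/l
as the unroll grows but never reaches it unless the loop control becomes
free.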
>
> the loop control is
> add $4,%ebx
> jnz loop
>
> which is two slots, and each "basic block2" has a spare slot, so maybe
> putting the loop control somewhere else (say after the mul?) will make a
> difference.
>
>
>
>
>>
>> Bill.
>
>
> >
>
