That should say "your block2". Typo, sorry.

2008/11/23 Bill Hart <[EMAIL PROTECTED]>:
> It seems like unrolling our block2 by 2 could be made optimal in
> theory. You need 2 slots for the loop control. There are 14 slots in
> your block2.
>
> 2*14 + 2 = 30.
>
> That would give 10/4 = 2.5c/l.
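
The arithmetic above skips one step: turning 30 slots into 10 cycles only works at 3 slots per cycle. That rate is not stated outright in the thread. It is an assumption inferred from the quoted message below, where a 5-cycle block holds 14 used slots plus 1 spare. A rough Python sketch of the accounting, with hypothetical names:

```python
# Assumption (inferred, not stated in the thread): 3 slots (macro-ops) per
# cycle, since a 5-cycle block2 holds 14 used slots plus 1 spare slot.
SLOTS_PER_CYCLE = 3


def slot_model(blocks, slots_per_block=14, loop_slots=2, limbs_per_block=2):
    """Return (total slots, cycles, cycles per limb) for `blocks` copies of block2."""
    total_slots = blocks * slots_per_block + loop_slots
    cycles = total_slots / SLOTS_PER_CYCLE
    limbs = blocks * limbs_per_block
    return total_slots, cycles, cycles / limbs


# Two copies of block2 plus the loop control: 2*14 + 2 = 30 slots.
print(slot_model(2))  # (30, 10.0, 2.5)
```

This is only a best case: it assumes every slot is filled on every cycle, which dependencies between instructions may well prevent.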
>
> By the way, you suggest that perhaps moving the loop control up might
> help. If the processor has out-of-order capability, why would this
> help? Is there something else that prevents that from executing
> earlier regardless?
>
> Bill.
>
> 2008/11/23  <[EMAIL PROTECTED]>:
>>
>> On Sunday 23 November 2008 18:53:46 Bill Hart wrote:
>>> That's very impressive!
>>>
>>> What do you mean by a slot?
>>>
>>
>> Whatever I want it to mean, seeing how vague most asm docs are.
>>
>> A macro-op.
>>
>>
>>> I presume by ax you mean rax, etc.
>>>
>>
>> Yeah, just lazy.
>>
>>> There's also going to be some loop overhead right?
>>
>> Included already.
>>
>> Just been thinking about the timings I got from the different unrolling.
>>
>> mov $0,%r9
>> mul %rcx
>> add %rax,%r8
>> mov 8(%rsi,%rbx,8),%rax
>> adc %rdx,%r9
>> mov %r8,(%rdi,%rbx,8)
>> mul %rcx
>> mov $0,%r10
>> add %rax,%r9
>> mov 16(%rsi,%rbx,8),%rax
>> adc %rdx,%r10
>> mov %r9,8(%rdi,%rbx,8)
>>
>> Above is a "basic block2" for two limbs. This is just two of the "basic
>> block" from before, stuck together, but with the loads shifted up.
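
For anyone reading along without the earlier messages: as far as I can tell from the listing, each mul/add/adc/store group multiplies one source limb by the constant in %rcx, adds the carry limb from the previous step, stores the low 64 bits and passes the high 64 bits on. That is my reading of the code, not something the thread states. A minimal Python model of it, assuming 64-bit limbs (the name mul_1_model is hypothetical):

```python
MASK = (1 << 64) - 1  # assumption: 64-bit limbs, as the r-registers suggest


def mul_1_model(src, c):
    """Multiply a little-endian list of limbs by the single limb c.

    Returns (result limbs, final carry limb).
    """
    carry = 0
    out = []
    for limb in src:
        prod = limb * c + carry   # mul %rcx, then add/adc fold in the carry
        out.append(prod & MASK)   # low half is stored to the destination
        carry = prod >> 64        # high half becomes the next carry
    return out, carry


limbs, carry = mul_1_model([MASK, MASK], MASK)
print([hex(x) for x in limbs], hex(carry))
```

The asm splits this across add and adc because the product lives in two registers. Python's big integers hide that detail.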
>>
>> If we assume this runs in 5 cycles, and the loop control takes an extra 1
>> cycle, then
>> unroll by 2 is (1*5+1)/2=3 c/l
>> unroll by 4 is (2*5+1)/4=2.75 c/l
>> unroll by 8 is (4*5+1)/8=2.625 c/l
>> unroll by 16 is (8*5+1)/16=2.5625 c/l
>>
>> which matches my timings exactly plus a little bit
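
The model behind those four figures is easy to write down: each two-limb block costs 5 cycles and each loop iteration pays 1 extra cycle. A small Python sketch, assuming exactly those numbers (the function name is made up for illustration):

```python
def predicted_cpl(unroll, block_cycles=5, loop_cycles=1, limbs_per_block=2):
    """Predicted cycles per limb when the loop body handles `unroll` limbs."""
    blocks = unroll // limbs_per_block
    return (blocks * block_cycles + loop_cycles) / unroll


for unroll in (2, 4, 8, 16):
    print(unroll, predicted_cpl(unroll))
# 2 3.0
# 4 2.75
# 8 2.625
# 16 2.5625
```

The loop overhead is divided by the unroll factor, so under this model the figures approach 5/2 = 2.5 c/l but never reach it.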
>>
>> the loop control is
>> add $4,%ebx
>> jnz loop
>>
>> which is two slots, and each "basic block2" has a spare slot. Maybe putting
>> the loop control at a different place (say after the mul?) will make a
>> difference.
>>
>>
>>
>>
>>>
>>> Bill.
>>
