That should say "your block2". Typo, sorry.

2008/11/23 Bill Hart <[EMAIL PROTECTED]>:
> It seems like unrolling our block2 by 2 could be made optimal in
> theory. You need 2 slots for the loop control. There are 14 slots in
> your block2.
>
> 2*14 + 2 = 30.
>
> That would give 10/4 = 2.5c/l.
>
> By the way, you suggest that perhaps moving the loop control up might
> help. If the processor has out-of-order capability, why would this
> help? Is there something else that prevents that from executing
> earlier regardless?
>
> Bill.
>
> 2008/11/23 <[EMAIL PROTECTED]>:
>>
>> On Sunday 23 November 2008 18:53:46 Bill Hart wrote:
>>> That's very impressive!
>>>
>>> What do you mean by a slot?
>>>
>>
>> Whatever I want it to mean!! Seeing how vague most asm docs are.
>>
>> A macro-op.
>>
>>
>>> I presume by ax you mean rax, etc.
>>>
>>
>> Yeah, just lazy.
>>
>>> There's also going to be some loop overhead, right?
>>
>> Included already.
>>
>> Just been thinking about the timings I got from the different unrolling.
>>
>> mov $0,%r9
>> mul %rcx
>> add %rax,%r8
>> mov 8(%rsi,%rbx,8),%rax
>> adc %rdx,%r9
>> mov %r8,(%rdi,%rbx,8)
>> mul %rcx
>> mov $0,%r10
>> add %rax,%r9
>> mov 16(%rsi,%rbx,8),%rax
>> adc %rdx,%r10
>> mov %r9,8(%rdi,%rbx,8)
>>
>> Above is a "basic block2" for two limbs. This is just two of the "basic
>> block" from before, stuck together, but with the loads shifted up.
>>
>> If we assume this runs in 5 cycles, and the loop control takes an extra 1
>> cycle, then
>> unroll by 2 is (1*5+1)/2=3 c/l
>> unroll by 4 is (2*5+1)/4=2.75 c/l
>> unroll by 8 is (4*5+1)/8=2.625 c/l
>> unroll by 16 is (8*5+1)/16=2.5625 c/l
>>
>> which matches my timings exactly, plus a little bit.
>>
>> The loop control is
>> add $4,%ebx
>> jnz loop
>>
>> which is two slots, and each "basic block2" has a spare slot. Maybe putting
>> the loop control at a different place (say after the mul?) will make a
>> difference.
>>
>>
>>>
>>> Bill.
>>
>>
>
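
For anyone who wants to check the numbers in the quoted messages, here is a rough sketch of both estimates. The function names are made up for illustration. The figure of 3 slots per cycle is an assumption, inferred from a block2 having 14 slots plus one spare and being assumed to run in 5 cycles; nobody in the thread states it directly.

```python
from fractions import Fraction

# Figures taken from the thread.
BLOCK2_CYCLES = 5      # assumed run time of one "basic block2"
LIMBS_PER_BLOCK2 = 2   # each block2 handles two limbs
LOOP_CYCLES = 1        # assumed extra cycle for the loop control

# Assumption, not stated in the thread: 14 used slots + 1 spare slot
# spread over 5 cycles implies 3 slots per cycle.
SLOTS_PER_CYCLE = 3


def measured_model(unroll):
    """Cycles per limb when each block2 costs 5 cycles plus 1 for the loop."""
    blocks = unroll // LIMBS_PER_BLOCK2
    return Fraction(blocks * BLOCK2_CYCLES + LOOP_CYCLES, unroll)


def packed_model(unroll, block_slots=14, loop_slots=2):
    """Cycles per limb if every slot could be filled (the 2*14 + 2 = 30 case)."""
    blocks = unroll // LIMBS_PER_BLOCK2
    slots = blocks * block_slots + loop_slots
    cycles = Fraction(slots, SLOTS_PER_CYCLE)
    return cycles / unroll


if __name__ == "__main__":
    for unroll in (2, 4, 8, 16):
        print("unroll by", unroll, "->", float(measured_model(unroll)), "c/l")
    print("block2 unrolled by 2, fully packed ->", float(packed_model(4)), "c/l")
```

Under these assumptions the first function gives the 3, 2.75, 2.625 and 2.5625 c/l figures, and the second gives 2.5 c/l for four limbs per iteration.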