On Thursday 19 February 2009 00:49:16 Jason Martin wrote:
> On Wed, Feb 18, 2009 at 7:13 PM,  <ja...@njkfrudils.plus.com> wrote:
> > On Wednesday 18 February 2009 22:03:43 Mariah wrote:
> >> gmp-4.2.4   mpir-0.9.0
> >>
> >> 2241.9      2251         cicero (pentium4-pc-linux-gnu)
> >> 3371.5      3369.3      cleo (ia64-unknown-linux-gnu)
> >> 6024.5      7437.8      eno (core2-unknown-linux-gnu)
> >> 6022.2      7387.1      fulvia (core2-pc-solaris2.10)
> >> 3367.8      3369.5      iras (ia64-unknown-linux-gnu)
> >> 1341.3      1343.6      mark (ultrasparc3-sun-solaris2.10)
> >> 6100         7421.1      menas (core2-unknown-linux-gnu)
> >>
> >> Mariah
> >
> > K10 crushes core-2 (intel fanbois hide their heads in shame :)
> >
> > gmp-4.2.4       mpir-0.9.0      r1614-k8-branch
> > 6014            7379            10118           box1 (k8-unknown-linux-gnu) 1.8GHz
> > 9301            11659           15514           cuda1 (k10-unknown-linux-gnu) 2.6GHz
>
> I don't think that the core2 can get much faster... the addmul (and
> friends) are running just shy of 4 cycles/limb which is the max
> throughput rate for the 64-bit multiply instruction on core2.  I'm
> appropriately hiding in shame :-)
>

A lot of the speed comes from reducing the overhead in mul_basecase. The first
mul_basecase I did that used a 2.5c/l addmul loop gave a score of 8200.

How does a 20x20 mul_basecase compare with a 400 limb addmul_1?

For K8 we have:
1110 cycles for 20x20 mul_basecase
1031 cycles for 400 limb addmul_1
That is about a 7.7% overhead for basecase over addmul.
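
To make the comparison concrete, here is a rough Python sketch of why the two are comparable: a 20x20 basecase does 20 addmul_1-style rows of 20 limbs each, so 400 limb products in total, the same as one 400 limb addmul_1. This is only a toy model under stated assumptions (64-bit limbs, Python lists standing in for limb arrays, and the names addmul_1, mul_basecase and to_int are illustrative helpers here), not the actual assembly code. The cycle counts are the K8 figures quoted above.

```python
# Toy model of the limb-level structure; NOT the real mpn assembly code.
BASE = 2**64  # assumption: 64-bit limbs


def addmul_1(rp, off, up, v):
    """rp[off + i] += up[i] * v for each limb of up; return the carry-out limb."""
    carry = 0
    for i, u in enumerate(up):
        t = rp[off + i] + u * v + carry
        rp[off + i] = t % BASE
        carry = t // BASE
    return carry


def mul_basecase(up, vp):
    """Schoolbook product built from one addmul_1 row per limb of vp."""
    rp = [0] * (len(up) + len(vp))
    for j, v in enumerate(vp):
        # The limb just past this row has not been written yet, so the
        # row's carry-out can simply be stored there.
        rp[j + len(up)] = addmul_1(rp, j, up, v)
    return rp


def to_int(limbs):
    """Convert a little-endian limb list to a Python integer."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))


# Cycle counts quoted above for K8.
basecase_cycles = 1110   # 20x20 mul_basecase
addmul_cycles = 1031     # 400 limb addmul_1
limb_products = 20 * 20  # both do 400 limb multiplies

overhead_pct = 100.0 * (basecase_cycles - addmul_cycles) / addmul_cycles
basecase_cpl = basecase_cycles / limb_products  # cycles per limb product
addmul_cpl = addmul_cycles / limb_products
```

With those figures the basecase works out to 2.775 cycles per limb product against roughly 2.58 for the plain addmul_1, so the extra cost is the per-row setup and loop wind-down rather than the inner loop itself.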

