On Mar 3, 9:59 pm, Bill Hart <goodwillh...@googlemail.com> wrote:
> Hi Brian,
>
> could you describe what we are looking at in that graph? What are the
> machines 1, 2, 3 and 4 in order and what do the horizontal and
> vertical axes stand for?
>
> I'd be really surprised if there are any differences in the
> instruction scheduling of your machine and the 65 nm machine (martinj)
> though I can't prove it with an actual document.
>
> There's lots online about how the jump from core duo to core 2 duo on
> mobile platforms was not as great as for desktops. But the number
> crunching abilities of the Core 2 duo mobile seem to be not in doubt.
>
> If the issues are for very small sized integers, then I don't see a
> hardware architectural reason for it.
>
> Bill.
>
> 2009/3/3 Cactus <rieman...@googlemail.com>:
>
>
>
> > On Mar 3, 3:32 pm, Cactus <rieman...@googlemail.com> wrote:
> >> On Mar 3, 3:07 pm, Jeff Gilchrist <jeff.gilchr...@gmail.com> wrote:
>
> >> > On Tue, Mar 3, 2009 at 9:58 AM, Bill Hart <goodwillh...@googlemail.com> 
> >> > wrote:
> >> > > So I hope we are not just looking at architecture differences
> >> > > between your core2 and that machine here.
>
> >> > There are architectural changes between the 65nm and 45nm Core2
> >> > processors, and the Xeons tend to have much larger cache sizes so
> >> > that might be playing a role.
>
> >> > Brian, what are the specs for your Core 2 machine?  I will see if I
> >> > can find a Core2 machine that I can test both the Linux code and
> >> > Windows code on the same system.  I'm not sure where to grab the
> >> > mpirbench or the cycle test code, I don't seem to see it anywhere
> >> > obvious inside trunk/.
>
> >> My machine is a Dell M65 portable as follows:
>
> >> Intel Mobile Core 2 Duo T7400 running at 2.16GHz
>
> >> Merom at 65nm, family 6, model F, stepping 6
>
> >> L1 2 x 32 KBytes data
> >> L1 2 x 32 KBytes instructions
> >> L2 4096 KBytes
>
> >> Memory 2GBytes DDR2
>
> >>     Brian
>
> > I have now graphed results for mpn_add_n on four systems: JM and my
> > K8's, Bill's Linux Core2 and my Windows Core2.   I have uploaded
> > mpn_add_n.pdf with the results which show that my Windows results are
> > anomalous.  K8 -> 1.7 cycles/limb, Linux Core2 -> 2.3 cycles/limb,
> > Windows Core2 -> 4.3 cycles/limb.
>
> > What is odd is that the assembler code files for this function are
> > identical in the AMD and Core2 builds. And if I run my AMD binary on
> > my Core2 it runs at 4.3 cycles/limb.
>
> > So is this something to do with the way mobile Core2 machines are
> > configured to cut power when not under load?
>
> > Does power saving cause this sort of effect - the difference in the
> > results is much smaller when the high cost benchmarks are run (mul and
> > sqr).
> >    Brian
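
One way to put a number on the power-saving question quoted above. This
is only a minimal sketch, and it assumes something the thread does not
establish: that the benchmark derives its cycle counts from wall-clock
time multiplied by the nominal clock rate. Under that assumption a core
running below nominal speed inflates the reported cycles/limb by the
ratio nominal/actual. The function names are hypothetical, and treating
the Linux Core2 figure (2.3) as the true cost on the T7400 is a further
assumption.

```python
def apparent_cycles_per_limb(true_cpl, nominal_hz, actual_hz):
    """Cycles/limb a timer would report if it converts wall time to
    cycles at nominal_hz while the core really runs at actual_hz."""
    if nominal_hz <= 0 or actual_hz <= 0:
        raise ValueError("clock rates must be positive")
    return true_cpl * nominal_hz / actual_hz


def implied_clock_hz(true_cpl, apparent_cpl, nominal_hz):
    """Clock rate that would be needed for throttling alone to turn
    true_cpl into apparent_cpl (inverse of the function above)."""
    if true_cpl <= 0 or apparent_cpl <= 0:
        raise ValueError("cycles/limb values must be positive")
    return nominal_hz * true_cpl / apparent_cpl


if __name__ == "__main__":
    # Figures quoted in the thread: 2.3 (Linux Core2), 4.3 (Windows Core2),
    # 2.16 GHz nominal clock for the T7400.
    hz = implied_clock_hz(2.3, 4.3, 2.16e9)
    print(f"implied clock: {hz / 1e9:.2f} GHz")
```

With the quoted figures the implied clock comes out at roughly 1.16 GHz.
That only says what clock rate throttling alone would need to explain
the gap; it does not show that throttling is the cause.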

The horizontal axis is just the number of limbs in the speed results,
the vertical axis being the cycle count reported for this number of
limbs.
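
Since the graph plots cycle count against limb count, a cycles/limb
figure can be read off as the slope of a straight line through the
points, with the intercept absorbing fixed call overhead. Here is a
minimal sketch of that reading, assuming the data is available as
(limbs, cycles) pairs. The function name is hypothetical and the sample
points are synthetic, not taken from mpn_add_n.pdf.

```python
def fit_cycles_per_limb(points):
    """Least-squares line through (limbs, cycles) points.

    Returns (slope, intercept): slope is the estimated cycles/limb,
    intercept the estimated fixed overhead in cycles.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("need at least two distinct limb counts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Synthetic data: 20 cycles of overhead plus 2.3 cycles per limb.
    sample = [(limbs, 20 + 2.3 * limbs) for limbs in (10, 20, 40, 80)]
    slope, overhead = fit_cycles_per_limb(sample)
    print(f"{slope:.2f} cycles/limb, {overhead:.1f} cycles overhead")
```

A straight-line fit will smooth over a saw-tooth pattern, so it is only
a summary of the trend and not a substitute for looking at the curve.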

The very close lower two curves are Jason's and my own K8 results. The
green line is your Core2 result and the top blue line is my Windows
Core 2 result on my Dell M65.

I am struck by how beautifully regular my Core2 mpn_add_n results are
compared with the others - it's a very beautiful and elegant saw-tooth
curve even if it is so slow!!

    Brian

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to mpir-devel@googlegroups.com
To unsubscribe from this group, send email to 
mpir-devel+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/mpir-devel?hl=en
-~----------~----~----~----~------~----~------~--~---
