Excellent.  I'm about to leave for the MAA-AMS Joint Meetings, so I
won't get a chance to bench your K10 stuff on a Dunnington until I get
back.  I'll report on what speeds I get.

--jason

On Sun, Jan 4, 2009 at 5:44 PM,  <ja...@njkfrudils.plus.com> wrote:
>
> On Sunday 04 January 2009 01:57:05 Jason Martin wrote:
>> Fair enough.  I don't have any strong opinions on the matter.  I
>> should probably update the Intel code to include your popcount routine
>> since the new Intel cores support the popcnt instruction (introduced
>> alongside SSE 4.2, operating on the general-purpose registers).
>>
>> --jason
>
> It should run with no problem, although at what speed!
>
> I'm going to put the K10-specific stuff in below:
>
> x86_64/
> x86_64/core2/
> x86_64/amd64/
> x86_64/amd64/k10/
>
>
>>
>> On Sat, Jan 3, 2009 at 8:02 PM,  <ja...@njkfrudils.plus.com> wrote:
>> > On Sunday 04 January 2009 00:57:44 ja...@njkfrudils.plus.com wrote:
>> >> On Sunday 04 January 2009 00:36:46 Jason Martin wrote:
>> >> > Alternatively, we could stop trying to identify chips by marketing
>> >> > brands and just use the values returned by CPUID.  This would create a
>> >> > lot of duplicated code in sub-directories, but disk space is cheap.
>> >> > So, would something like:
>> >> >
>> >> > mpn/x86_64/<vendor>/<extended family number/model>
>> >> >
>> >> > work for our configuration?
>> >>
>> >> As most of the models are the same, this seems like a waste.
>> >> It also assumes that CPUID is the only differentiator. What about
>> >> L2 cache size (in the future?) or GPU coprocessors?
>> >>
>> >> How about, for each asm file, a description of its minimum
>> >> requirements, e.g.
>> >> add_n.asm     requires x86_64,LAHF
>> >> lshift.asm    requires x86_64,SSE4.2
>> >> hamdist.asm   requires x86_64,popcnt
>> >>
>> >> We would only bother with the differences that we actually use, i.e. we
>> >> would ignore things like the virtualization instructions. This doesn't
>> >> help with selecting among functions that run at different speeds.
>> >>
>> >> I think what happens at the moment is nearly the best.
>> >
>> > Whoops, didn't finish...
>> >
>> > When we get a function which splits an existing type into two (or more)
>> > subtypes, we duplicate the existing functions between the subtypes,
>> > and put the new function_1 into subtype1 and the new function_2 into
>> > subtype2.
>> >
>> >> > Jason Worth Martin
>> >> > Asst. Professor of Mathematics
>> >> > http://www.math.jmu.edu/~martin
>> >> >
>> >> > On Sat, Jan 3, 2009 at 5:49 PM, Bill Hart <goodwillh...@googlemail.com> wrote:
>> >> > > I think that features such as SSE should be tested for after testing
>> >> > > for the main chip core. So under /mpn/x86_64/k8 you'd have
>> >> > > directories for any features not available on all k8's.
>> >> > >
>> >> > > Bill.
>> >> > >
>> >> > > 2009/1/3 mabshoff <michael.absh...@mathematik.uni-dortmund.de>:
>> >> > >> On Jan 3, 2:25 pm, jason <ja...@njkfrudils.plus.com> wrote:
>> >> > >>> On Jan 3, 9:00 am, "Bill Hart" <goodwillh...@googlemail.com> wrote:
>> >> > >>
>> >> > >> Hi,
>> >> > >>
>> >> > >>> > The new Intel machines. And I don't know if all Dunningtons use
>> >> > >>> > the same family/system CPUID etc. So there might be multiple
>> >> > >>> > CPUIDs we need to add to config.guess.
>> >> > >>> >
>> >> > >>> > Bill.
>> >> > >>>
>> >> > >>> We should change the lowest common denominator on an x86_64 system
>> >> > >>> to something more useful than 486, say a 64-bit P4 without LAHF,
>> >> > >>> so that people can at least get MPIR working on new machines
>> >> > >>> without mucking about.
>> >> > >>
>> >> > >> Well, the trouble was that configure believed it was a 32 bit
>> >> > >> system, so I don't see much we can do there aside from attempting
>> >> > >> to compile things in 64 bit mode.
>> >> > >>
>> >> > >>> For the K10 we will need a separate directory. I have
>> >> > >>> mpn_popcount and mpn_hamdist which will not run on the K8; they
>> >> > >>> require SSE4a (or whatever it's called).
>> >> > >>> Before: 7/7.75 c/l; now: 1.5/1.75 c/l.
>> >> > >>
>> >> > >> Wouldn't it be better to create an SSE4a directory and use that
>> >> > >> assembly code when SSE4a is available? That seems to be the
>> >> > >> prevailing way to do things.
>> >> > >>
>> >> > >> On second thought: according to http://en.wikipedia.org/wiki/SSE4 it
>> >> > >> seems that there are three SSE4 flavors:
>> >> > >>
>> >> > >>  * SSE 4.1
>> >> > >>  * SSE 4.2
>> >> > >>  * SSE4a
>> >> > >>
>> >> > >> The last one seems to be K10 specific for now, but I would still
>> >> > >> recommend testing for SSE4a if your code is that specific.
>> >> > >>
>> >> > >> <SNIP>
>> >> > >>
>> >> > >> Cheers,
>> >> > >>
>> >> > >> Michael
>>
>>
>
>
> >
>

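For reference, the per-file "requires" idea floated earlier in the thread could look roughly like the sketch below. This is only an illustration under assumptions: the feature bit names, the `asm_file` struct, and the helper functions are all hypothetical and not part of MPIR's configure machinery. The three table entries are the examples given in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits, named after the requirements in the thread. */
enum {
    FEAT_X86_64 = 1u << 0,
    FEAT_LAHF   = 1u << 1,
    FEAT_POPCNT = 1u << 2,
    FEAT_SSE4_2 = 1u << 3
};

/* One entry per asm file: its name and the features it needs. */
struct asm_file {
    const char *name;
    unsigned needs;
};

/* The three example files from the thread (illustrative only). */
static const struct asm_file asm_files[] = {
    { "add_n.asm",   FEAT_X86_64 | FEAT_LAHF   },
    { "lshift.asm",  FEAT_X86_64 | FEAT_SSE4_2 },
    { "hamdist.asm", FEAT_X86_64 | FEAT_POPCNT },
};

/* A file is usable when every feature it needs is present on the CPU. */
static int file_usable(const struct asm_file *f, unsigned cpu_features)
{
    assert(f != NULL);
    return (f->needs & ~cpu_features) == 0;
}

/* Store the names of all usable files in out[] (at most cap entries)
   and return how many were stored. */
static size_t select_files(unsigned cpu_features, const char **out, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < sizeof asm_files / sizeof asm_files[0]; i++) {
        if (n < cap && file_usable(&asm_files[i], cpu_features))
            out[n++] = asm_files[i].name;
    }
    return n;
}
```

Calling `select_files(FEAT_X86_64 | FEAT_LAHF, names, 3)` would pick only `add_n.asm`. As noted in the thread, a scheme like this only filters out code that cannot run; it says nothing about which of several usable variants is fastest on a given core.
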
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to mpir-devel@googlegroups.com
To unsubscribe from this group, send email to 
mpir-devel+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/mpir-devel?hl=en
-~----------~----~----~----~------~----~------~--~---
