On Sunday 04 January 2009 01:57:05 Jason Martin wrote:
> Fair enough. I don't have any strong opinions on the matter. I
> should probably update the Intel code to include your popcount routine
> since the new Intel cores support the SSE 4.1 instruction for popcount
> on the xmm registers.
>
> --jason

It should run no problem, although at what speed!!!

I'm going to put the k10 specific stuff in below

x86_64/
x86_64/core2/
x86_64/amd64/
x86_64/amd64/k10/

>
> On Sat, Jan 3, 2009 at 8:02 PM, <ja...@njkfrudils.plus.com> wrote:
> > On Sunday 04 January 2009 00:57:44 ja...@njkfrudils.plus.com wrote:
> >> On Sunday 04 January 2009 00:36:46 Jason Martin wrote:
> >> > Alternatively, we could stop trying to identify chips by marketing
> >> > brands and just use the values returned by CPUID. This would create a
> >> > lot of duplicated code in sub-directories, but disk space is cheap.
> >> > So, would something like:
> >> >
> >> > mpn/x86_64/<vendor>/<extended family number/model>
> >> >
> >> > work for our configuration?
> >>
> >> As most of the models are the same, this seems like a waste.
> >> Also this assumes that CPUID is the only differentiator, what about
> >> L2-cachesize (in the future?), GPU coprocessors
> >>
> >> How about, for each asm file a description of minimum requirements
> >> eg
> >> add_n.asm requires x86_64,LAHF
> >> lshift.asm requires x86_64,SSE4.2
> >> hamdist.asm requires x86_64,popcnt
> >>
> >> and we only bother with the differences that we use, ie virtualization
> >> instructions we don't bother with. This doesn't help with selecting among
> >> functions that run at different speeds.
> >>
> >> I think what happens at the moment is nearly the best.
> >
> > Whoops, didn't finish ....
> >
> > When we get a function which splits an existing type into two (or more)
> > subtypes then we duplicate the existing functions between the subtypes,
> > and put the new function_1 into subtype1 and new function_2 into subtype2
> >
> >> > Jason Worth Martin
> >> > Asst. Professor of Mathematics
> >> > http://www.math.jmu.edu/~martin
> >> >
> >> > On Sat, Jan 3, 2009 at 5:49 PM, Bill Hart
> >> > <goodwillh...@googlemail.com>
> >>
> >> wrote:
> >> > > I think that features such as SSE should be tested for after testing
> >> > > for the main chip core.
> >> > > So under /mpn/x86_64/k8 you'd have
> >> > > directories for any features not available on all k8's.
> >> > >
> >> > > Bill.
> >> > >
> >> > > 2009/1/3 mabshoff <michael.absh...@mathematik.uni-dortmund.de>:
> >> > >> On Jan 3, 2:25 pm, jason <ja...@njkfrudils.plus.com> wrote:
> >> > >>> On Jan 3, 9:00 am, "Bill Hart" <goodwillh...@googlemail.com>
> >> > >>> wrote:
> >> > >>
> >> > >> Hi,
> >> > >>
> >> > >>> > The new intel machines. And I don't know if all Dunnington's use
> >> > >>> > the same family/system CPUID etc. So there might be multiple
> >> > >>> > CPUID's we need to add to config.guess.
> >> > >>> >
> >> > >>> > Bill.
> >> > >>>
> >> > >>> We should change the lowest common denominator on an x86_64 system
> >> > >>> to something more useful than 486, say P4 64bit without LAHF?,
> >> > >>> then people can at least get mpir working on new machines without
> >> > >>> mucking about
> >> > >>
> >> > >> Well, the trouble was that configure believed it was a 32 bit
> >> > >> system, so I don't see much we can do there aside from attempting
> >> > >> to compile things in 64 bit mode.
> >> > >>
> >> > >>> For the K10, we will need a separate directory for it, I have
> >> > >>> mpn_popcount and mpn_hamdist which will not run on the K8,
> >> > >>> requires SSE4.1a or whatever it's called ...
> >> > >>> before 7/7.75 c/l now 1.5/1.75 c/l
> >> > >>
> >> > >> Wouldn't it be better to create a SSE4.1a directory and use that
> >> > >> assembly code when SSE 4.1a is available? That seems to be the
> >> > >> prevailing way to do things.
> >> > >>
> >> > >> On second thought: according to http://en.wikipedia.org/wiki/SSE4 it
> >> > >> seems that there are three SSE4 flavors:
> >> > >>
> >> > >> * SSE 4.1
> >> > >> * SSE 4.2
> >> > >> * SSE 4.1a
> >> > >>
> >> > >> The last one seems to be K10 specific for now, but I would still
> >> > >> recommend to test for SSE 4.1a if your code is that specific.
> >> > >>
> >> > >> <SNIP>
> >> > >>
> >> > >> Cheers,
> >> > >>
> >> > >> Michael
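
A rough sketch of the "minimum requirements per asm file" idea quoted in the thread (add_n.asm requires x86_64,LAHF and so on). This is only an illustration, not how MPIR's configure works: the table layout, the function name usable_files and the feature sets passed to it are all hypothetical. Only the file names and their requirement lists are taken from the thread.

```python
# Hypothetical requirement table. The three entries are the examples given
# in the thread; the dict-of-sets layout is an assumption for illustration.
REQUIREMENTS = {
    "add_n.asm": {"x86_64", "LAHF"},
    "lshift.asm": {"x86_64", "SSE4.2"},
    "hamdist.asm": {"x86_64", "popcnt"},
}


def usable_files(cpu_features, requirements=REQUIREMENTS):
    """Return the asm files whose requirements are all met by cpu_features."""
    have = set(cpu_features)
    # A file qualifies when its requirement set is a subset of what the CPU has.
    return sorted(name for name, needs in requirements.items() if needs <= have)


if __name__ == "__main__":
    # Made-up feature sets for illustration only; not real CPUID results.
    print(usable_files({"x86_64", "LAHF"}))
    print(usable_files({"x86_64", "LAHF", "popcnt"}))
```

As noted in the thread, a table like this only says which files can run on a given chip. It does not help choose between two usable versions that run at different speeds.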