On Wed, Dec 12, 2007 at 11:51:20PM +0530, Rohit Garg wrote:
> Hi all,
>
> I was following the separate discussion on this list about writing
> various trig functions using vector intrinsics. I googled for it. The
> top few results I got were for "old" processors when SIMD intrinsics
> were new. The gcc documentation (my version is 4.1.2) has a list of
> intrinsics but no description, not even one line per intrinsic.

I believe those are 1-to-1 with the actual machine instructions. See
the Intel or AMD docs.

> As there is need to optimize the codebase for new processors (conroe,
> barcelona etc) any way, can you please point me to some real
> documentation on the subject. I would really appreciate any help.

I'm not sure exactly what you're looking for. Both Intel and AMD have
manuals about optimizing code for their microarchitectures. You'll
find them somewhere on their developer sites.

Probably the biggest place that needs improvement is trig functions.
I suggest starting with sin(x), cos(x) and sincos(x) for x a scalar
float, and a related version that computes 4 in parallel for x a
vector of 4 floats. I'd do two versions of each: SSE2 for x86 and
SSE2 for x86_64 (on the 64 you've got twice as many registers to work
with). We need them with something close to single-precision floating
point accuracy. You'll need to figure out what input domain you're
willing to accept; I'd say at a minimum +/- 4*pi.

> As a related question, possibly a digression, given that these
> extensions are the key to unlock full power of new processors and yet
> are rather low level (we are still writing trig funcs), is there any
> FLOSS library for simd math?

Not sure. Please check it out and let us know what you find.
There is of course the ATLAS stuff (optimized BLAS).

Eric

_______________________________________________
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
http://lists.gnu.org/mailman/listinfo/discuss-gnuradio
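
As a rough illustration of the 4-wide sin(x) suggested in the message
above, here is a minimal sketch in C using SSE2 intrinsics. It is not
code from GNU Radio. The names sin4_ps and sin4_array are hypothetical,
and the approach (reduce x to [-pi/2, pi/2] by subtracting a multiple of
pi, then evaluate a Taylor polynomial through the x^11 term) is only one
assumed way to get near single-precision accuracy for inputs within
roughly +/- 4*pi. A tuned minimax polynomial would likely do better.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE float ops */

/* Hypothetical helper: sine of four packed floats.
 * Assumption: inputs lie within roughly +/- 4*pi. */
static __m128 sin4_ps(__m128 x)
{
    const __m128 inv_pi = _mm_set1_ps(0.318309886f);     /* 1/pi */
    /* pi split in two parts so that k*pi_hi is exact for small k */
    const __m128 pi_hi  = _mm_set1_ps(3.140625f);
    const __m128 pi_lo  = _mm_set1_ps(9.67653589793e-4f);

    /* k = round(x / pi), using the default round-to-nearest mode */
    __m128i k  = _mm_cvtps_epi32(_mm_mul_ps(x, inv_pi));
    __m128  kf = _mm_cvtepi32_ps(k);

    /* r = x - k*pi, so r is in about [-pi/2, pi/2] */
    __m128 r = _mm_sub_ps(x, _mm_mul_ps(kf, pi_hi));
    r = _mm_sub_ps(r, _mm_mul_ps(kf, pi_lo));

    /* Taylor series for sin(r), Horner form in r^2, through r^11 */
    __m128 r2 = _mm_mul_ps(r, r);
    __m128 p  = _mm_set1_ps(-2.5052108e-8f);                          /* -1/11! */
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(2.7557319e-6f));   /*  1/9!  */
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(-1.9841270e-4f));  /* -1/7!  */
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(8.3333333e-3f));   /*  1/5!  */
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(-1.6666667e-1f));  /* -1/3!  */
    p = _mm_add_ps(_mm_mul_ps(_mm_mul_ps(p, r2), r), r);             /* r + r^3*poly */

    /* sin(x) = (-1)^k * sin(r): flip the sign bit in lanes where k is odd */
    __m128i sign = _mm_slli_epi32(_mm_and_si128(k, _mm_set1_epi32(1)), 31);
    return _mm_xor_ps(p, _mm_castsi128_ps(sign));
}

/* Hypothetical convenience wrapper for plain (unaligned) float arrays. */
static void sin4_array(const float in[4], float out[4])
{
    _mm_storeu_ps(out, sin4_ps(_mm_loadu_ps(in)));
}
```

Calling sin4_array on four inputs fills out[] with their approximate
sines in one pass. A scalar version could reuse the same reduction and
polynomial with the _ss variants of the intrinsics, and cos(x) can be
obtained by the same scheme after shifting the argument by pi/2.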