On 25 Apr 2001, at 23:33, George Woltman wrote:
> I just completed my first 512K FFT using the new SSE2 instructions!
> The 512K FFT handles exponents up to 10.3 million.
As opposed to 10.32 million with the 8087 code? If so, that's pretty
good - I thought you might have lost more precision than that.
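
For a sense of how small that difference would be, here is a rough sketch of the average number of exponent bits each FFT element carries in the two cases. The 10.3 million limit comes from the quoted post. The 10.32 million figure is only the one asked about above, not a confirmed number, and both are rounded. The function name is made up for illustration.

```python
FFT_LENGTH = 512 * 1024  # 512K elements, as in the quoted post


def bits_per_element(max_exponent, fft_length=FFT_LENGTH):
    """Average number of exponent bits stored in each FFT element."""
    return max_exponent / fft_length


# 10.3 million: limit quoted for the SSE2 code.
sse2_bits = bits_per_element(10_300_000)
# 10.32 million: the figure asked about for the 8087 code (unconfirmed).
x87_bits = bits_per_element(10_320_000)

print(f"SSE2 code: {sse2_bits:.3f} bits per element")
print(f"8087 code: {x87_bits:.3f} bits per element")
print(f"difference: {x87_bits - sse2_bits:.3f} bits per element")
```

If those limits are right, the SSE2 code gives up only a few hundredths of a bit per element, even though SSE2 works on 64-bit doubles while the 8087 registers hold 80-bit extended values.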
> Timings are as follows:
>
> 1.4GHz P4, old code: 0.126 sec.
> 1.4GHz P4, new code: 0.048 sec.
> 1.2GHz Athlon, 133MHz DDR: 0.084 sec.
Not bad at all, especially if you still have PC600 memory!
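
To put the quoted timings side by side, the short sketch below works out the speedup of the new code and a rough clock-cycle count per iteration. It uses only the three figures from the post. The dictionary layout and labels are just for illustration, and cycles are taken as clock rate times seconds, which ignores everything about memory speed.

```python
# (clock in Hz, seconds per iteration), copied from the quoted timings
timings = {
    "P4 1.4GHz, old code": (1.4e9, 0.126),
    "P4 1.4GHz, new code": (1.4e9, 0.048),
    "Athlon 1.2GHz, DDR": (1.2e9, 0.084),
}

new_seconds = timings["P4 1.4GHz, new code"][1]

# Speedup of the new SSE2 code relative to each of the other two results.
speedup_vs_old = timings["P4 1.4GHz, old code"][1] / new_seconds
speedup_vs_athlon = timings["Athlon 1.2GHz, DDR"][1] / new_seconds

print(f"new code vs old code on the P4: {speedup_vs_old:.3f}x")
print(f"new code vs the Athlon:         {speedup_vs_athlon:.3f}x")

# Rough cycles per iteration: clock rate multiplied by elapsed time.
cycles = {name: hz * sec for name, (hz, sec) in timings.items()}
for name, count in cycles.items():
    print(f"{name}: about {count / 1e6:.1f} million cycles per iteration")
```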
SSE2 applications are still thin on the ground, and the benchmarks
used by most mags are not kind to P4s. Intel could use some good
publicity; I hope they reward you handsomely for this work, which
surely must have some impact on sales.
I reckon they owe you _at the very least_ a fully equipped Itanium
(IA64) system so that you can start to do them _another_ favour!
>
> I have a few more optimizations up my sleeve. I think my goal
> of 0.040 seconds is achievable.
How many data passes per iteration? I think you may be getting very
close to saturating PC600 memory throughput!
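
The memory question can be roughed out as below. This is a back-of-envelope sketch, not a description of the actual code. It assumes each FFT element is one 8-byte double, that every data pass reads each element from memory once and writes it back once, and that nothing stays in cache between passes. The peak figure is a hypothetical nominal one (two PC600 channels at 1.2 GB/s each), and all the names are invented for illustration.

```python
FFT_LENGTH = 512 * 1024   # 512K elements, from the quoted post
BYTES_PER_ELEMENT = 8     # assumption: one 64-bit double per element

# Hypothetical nominal peak: two PC600 channels at 1.2 GB/s each.
ASSUMED_PEAK_BYTES_PER_SEC = 2.4e9


def bytes_per_pass(fft_length=FFT_LENGTH, bytes_per_element=BYTES_PER_ELEMENT):
    """Memory traffic of one pass: every element read once and written once."""
    return 2 * fft_length * bytes_per_element


def required_bandwidth(passes, seconds_per_iteration):
    """Bytes per second needed to complete `passes` passes in the given time."""
    return passes * bytes_per_pass() / seconds_per_iteration


def passes_to_saturate(peak_bytes_per_sec, seconds_per_iteration):
    """Number of passes per iteration at which the assumed peak is reached."""
    return peak_bytes_per_sec * seconds_per_iteration / bytes_per_pass()


for seconds in (0.048, 0.040):  # current timing and the stated goal
    for passes in (2, 4, 8):
        gb_per_sec = required_bandwidth(passes, seconds) / 1e9
        print(f"{seconds:.3f} s, {passes} passes: {gb_per_sec:.2f} GB/s needed")
    limit = passes_to_saturate(ASSUMED_PEAK_BYTES_PER_SEC, seconds)
    print(f"{seconds:.3f} s: assumed peak reached at about {limit:.1f} passes")
```

Sustained throughput is normally well below the nominal peak, and reads of sin/cos tables are not counted here, so the real ceiling would be lower than this estimate suggests.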
Regards
Brian Beesley
1775*2^332181+1 is prime! (100000 digits) Discovered 22-Apr-2001