[sage-devel] Re: SSE2 not so useless after all

Clement Pernet Thu, 22 May 2008 10:19:18 -0700

Hi,

> 
> Bill, I suppose that also means that now we actually beat (or are close to 
> beating) Magma on the C2D "for real". My M4RI times are quite similar on the 
> C2D as your times on your Opteron. But my version of Magma (on the C2D) is 
> much worse than your version of Magma (on the Opteron). So it is probably 
> best to assume at least your times for Magma on my machine too.
> 
That's awesome news!


> PS: I wonder if this argument makes sense:
> 
> We have a complexity of n^2.807 and a CPU of say 2.333 Ghz which can operate 
> on 64 bits per clock (128 if we use SSE2). So if we had optimal code (no 
> missed branch predictions, no caching issues, everything optimal) we would 
> expect a running time of    n^2.807 / 64 / 2.333 / 10^9

Don't forget the constants!
Strassen-Winograd is 6n^2.807.
Now this constant corresponds to both muls and adds, and I guess that your
boolean word operation ^= computes a + and a * in 1 clock cycle, so I
don't really know the constant in this case (6/2=3 seems dubious to me).

Furthermore, as Bill pointed out, one really has to count the actual
number of ops, since only a few recursive calls are made.
Either way, this would mean that the expected optimal time would be
larger, and consequently that your code is closer to optimal!
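
To make the "count the actual number of ops" point concrete, here is a
rough sketch of the count when only a few recursive levels are used.
Assumptions (mine, not M4RI's actual cost model): each Winograd level does
7 half-size multiplications and 15 half-size additions, the base case is
classical multiplication at 2n^3 - n^2 scalar ops, and n is divisible by
2^levels. The function name is made up for illustration.

```python
def winograd_ops(n, levels):
    """Scalar operation count for Strassen-Winograd with a fixed number
    of recursive levels (illustrative model only).

    Assumes a classical base case (n^3 multiplications, n^3 - n^2 additions)
    and that n is divisible by 2**levels.
    """
    if levels == 0:
        return 2 * n**3 - n**2
    half = n // 2
    # 7 recursive products plus 15 additions of half-size blocks
    return 7 * winograd_ops(half, levels - 1) + 15 * half * half


if __name__ == "__main__":
    n = 20000  # divisible by 2**5, so up to 5 levels are allowed here
    asymptotic = 6 * n**2.807
    for levels in range(0, 6):
        ops = winograd_ops(n, levels)
        print(levels, ops, ops / asymptotic)
```

Recursing all the way down to 1x1 blocks gives 6n^log2(7) - 5n^2 in this
model, which is where the constant 6 comes from. With word-level XORs and a
Four Russians base case the real constant will differ, so treat the ratios
as indicative only.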

> 
> If we plug in 20,000 for n we'd get 7.923 seconds w.o. SSE2 and 3.961 with 
> SSE2. So our implementation (12.2 s) is a factor of ~1.5 or ~3 away from 
> being optimal? Does that sound correct or complete bollocks?
> 
> 
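
For reference, the back-of-the-envelope estimate quoted above can be
reproduced with a few lines. This is only a sketch of that arithmetic: the
function and parameter names are mine, and the `constant` argument is an
assumed knob for the leading constant discussed earlier (left at 1 to match
the quoted figures).

```python
def ideal_seconds(n, bits_per_clock, clock_hz=2.333e9,
                  exponent=2.807, constant=1.0):
    """Idealized running time: constant * n^exponent bit operations,
    with bits_per_clock of them retired every clock cycle."""
    return constant * n**exponent / bits_per_clock / clock_hz


n = 20000
t64 = ideal_seconds(n, 64)    # plain 64-bit words
t128 = ideal_seconds(n, 128)  # 128-bit SSE2 registers
measured = 12.2               # M4RI timing quoted in the thread

if __name__ == "__main__":
    print("64-bit : %.3f s, measured/ideal = %.2f" % (t64, measured / t64))
    print("128-bit: %.3f s, measured/ideal = %.2f" % (t128, measured / t128))
```

Passing a `constant` other than 1 shows how sensitive the "factor from
optimal" is to the leading constant, which is exactly the uncertainty
raised above.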



