Hello folks,

By accident I built my 3.4.2.alpha0 build as an SSE2-only build, so I
had a chance to play with it a little and check for performance
regressions. Here are some basic benchmarks:

SSE2 vs. SSE3:

 * measurable difference for ZZ determinant (~10% slower with SSE2 for
300x300, 400x400, etc.)
 * huge difference (factor *4* slower with SSE2) for RDF matrix-matrix
multiply (2000x2000, 3000x3000, 4000x4000)
 * tiniest difference for ZZ matrix-matrix multiply (~1-2% *faster*
with SSE2 than with SSE3)
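
For anyone who wants to repeat this kind of comparison, here is a
minimal timing sketch in plain Python. It is not the script used for
the numbers above: the helper names (best_time, relative_slowdown,
naive_matmul) are made up for illustration, and the pure-Python matrix
multiply is only a stand-in workload. In a Sage session one would
presumably time something like random_matrix(ZZ, 300).det() or a
product of two random_matrix(RDF, 2000) matrices instead, once on each
build.

```python
import timeit


def best_time(func, repeat=5, number=1):
    """Return the best (minimum) wall-clock time in seconds per call.

    Taking the minimum over several repeats reduces noise from other
    processes, which matters when the expected difference is only a
    few percent.
    """
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


def relative_slowdown(t_baseline, t_other):
    """Percent by which t_other is slower than t_baseline.

    A negative result means t_other is faster than the baseline.
    """
    return (t_other - t_baseline) / t_baseline * 100.0


def naive_matmul(a, b):
    """Pure-Python stand-in workload (hypothetical, not ATLAS-backed)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]


if __name__ == "__main__":
    n = 40
    a = [[(i * j) % 7 for j in range(n)] for i in range(n)]
    t = best_time(lambda: naive_matmul(a, a))
    print("best of 5 runs: %.6f s" % t)
```

Run the same script on both builds and feed the two best times to
relative_slowdown(). Note that "factor 4 slower" corresponds to a
return value of 300, not 400.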

Dan Drake reported on IRC that some of the doctests in the matrix
directory ran 7% faster with SSE2 instead of SSE3. This rather
perplexing result might be due to the SSE2-only Hammer ATLAS having a
significantly smaller footprint in the cache (it certainly contains
way fewer SSE instructions), which would result in better cache
locality and hence faster code for the matrix directory doctests.
Overall these are some interesting developments. While the RDF
matrix-matrix multiplies did not surprise me one bit, the small
slowdown or tiny speedup for operations over ZZ (where some code uses
multimodular arithmetic and hence ATLAS) is a little puzzling.

Thoughts?

Cheers,

Michael