Alex Kruppa has done some SPARC timings of Mlucas 2.7x compiled using the
newish SPARC f90 v2 compiler, which appears to be much better than v1.
(At last, 64-bit loads and stores - hooray! Seems ridiculous for such
a thing to be worth cheering about, but like I said, v1 was *really* bad.)
He writes:

<< I've got a binary that needs 0.183 secs / iteration with the 256K array
size now. Seems the Fortran section of Sun Compilers borrowed a few smart
guys from the C section. :) Mlucas_2.7x on an UltraSparc II 300 MHz is
almost as fast as on a 400 MHz Alpha 21164. >>

Hi, Alex, and many thanks for the timings. That sounds promising - note that
the latest README file has a complete set of timing/accuracy tests for cases
from 64K to 4096K.
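
To put the 0.183 secs/iteration figure in perspective, here is a rough
sketch of the arithmetic. A Lucas-Lehmer test of 2^p - 1 needs p - 2
squarings, so the total run time is just that count times the
per-iteration time. The exponent below is a hypothetical value picked
only for illustration; it is not one of the README test cases.

```python
# Back-of-the-envelope run-time estimate from a per-iteration timing.
SECS_PER_ITER = 0.183   # timing quoted above for the 256K array size
CLOCK_MHZ = 300         # UltraSparc II clock quoted above
EXPONENT = 5_000_000    # ASSUMPTION: hypothetical exponent, illustration only


def ll_test_seconds(p, secs_per_iter):
    """A Lucas-Lehmer test of 2^p - 1 takes p - 2 modular squarings."""
    return (p - 2) * secs_per_iter


def cycles_per_iteration(secs_per_iter, clock_mhz):
    """Clock cycles spent per iteration, handy for comparing CPUs."""
    return secs_per_iter * clock_mhz * 1e6


total_days = ll_test_seconds(EXPONENT, SECS_PER_ITER) / 86400
mcycles = cycles_per_iteration(SECS_PER_ITER, CLOCK_MHZ) / 1e6

print(f"{total_days:.1f} days for the full test")
print(f"{mcycles:.1f} million cycles per iteration")
```

Dividing out the clock rate like this is only a crude way to compare
machines, since memory bandwidth and cache size matter a lot for FFTs
of this length.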

<< I tried all sorts of compiler flags - unfortunately, the optimization
flags are not linear, especially -O5 tends to produce much slower code
than -O4 when combined with other flags. >>

I see similar weird slowdowns using the -O5 compile option on some (not
all) Alpha CPUs (generally the older ones). I wonder if both compilers
are doing similar "optimizations" at -O5.

<< I'm using -fast -libmil -xlibmopt -fnsyes now, which seems to give
close to optimal performance. >>

As long as it correctly runs the self-tests, faster is better. Note that
a few of the self-tests will give some roundoff warnings, especially if
the compiler in question doesn't support extended-precision floats, which
Mlucas uses (when available) to initialize the sincos and DWT weight tables.
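
Here is a small sketch of why the precision used for table
initialization matters. It is not the Mlucas code: it uses single vs.
double precision as a stand-in for double vs. extended, and it builds
the table with a simple rotation recurrence (my assumption, chosen
because it makes the accumulated roundoff easy to see). All function
names are hypothetical.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def table_working_precision(n):
    """cos/sin(2*pi*k/n) by recurrence, rounding to single after every operation."""
    c1 = to_f32(math.cos(2 * math.pi / n))
    s1 = to_f32(math.sin(2 * math.pi / n))
    c, s = 1.0, 0.0
    table = []
    for _ in range(n):
        table.append((c, s))
        c, s = (to_f32(to_f32(c * c1) - to_f32(s * s1)),
                to_f32(to_f32(c * s1) + to_f32(s * c1)))
    return table


def table_higher_precision(n):
    """Same recurrence carried in double, rounded to single only when stored."""
    c1 = math.cos(2 * math.pi / n)
    s1 = math.sin(2 * math.pi / n)
    c, s = 1.0, 0.0
    table = []
    for _ in range(n):
        table.append((to_f32(c), to_f32(s)))
        c, s = c * c1 - s * s1, c * s1 + s * c1
    return table


def max_error(table):
    """Largest deviation of any table entry from the directly computed value."""
    n = len(table)
    return max(
        max(abs(c - math.cos(2 * math.pi * k / n)),
            abs(s - math.sin(2 * math.pi * k / n)))
        for k, (c, s) in enumerate(table)
    )


N = 4096
err_working = max_error(table_working_precision(N))
err_higher = max_error(table_higher_precision(N))
print("max error, working-precision init:", err_working)
print("max error, higher-precision init: ", err_higher)
```

The table built in the higher precision ends up with entries that are
essentially correctly rounded, while the one built entirely in the
working precision carries the accumulated roundoff into every later
multiply.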

<< I don't know whether this is also optimal on other types of UltraSparc, I
only have Ultra60s for testing. >>

Let me know where I can ftp the binary, and I suspect we'll soon get lots
more SPARC timings - it's a popular Unix platform.
 
Thanks,
Ernst

_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
