Hi all,
Glucas v.2.8c has been released. You can download the source code or
precompiled binaries from the SourceForge site:
http://sourceforge.net/projects/glucas
or, as a much slower alternative, from my anonymous FTP server:
ftp://ftp.oxixares.com/pub/glucas/Glucas-2.8c
The main features of this release, excerpted from the ChangeLog:
- Most of the FFT code has been rewritten specifically for IA64 (it is
still standard C code), and together with the Intel compiler this has
doubled the speed on that platform. Itanium is now the fastest Glucas
runner (per clock).
The improvement is not only due to the rewritten code; a new preload
scheme has also been introduced. Basically, each loop cycle loads what is
needed in the next one. It is only efficient on processors with a lot of
registers, such as Itanium (128 FP registers). To activate this code on
the IA64 architecture, compile with -DY_ITANIUM
- Klaus Kastens has improved the timing routines. The elapsed real
time is now measured with millisecond precision.
- Klaus has also written a new output format for verbose information.
It now reports the percentage of processor time spent by Glucas.
- Some small changes for other platforms, with no significant
performance improvements.
- Fixed a rare memory bug affecting huge selftests under OpenBSD
(reported by Gregory Matus). Better memory management.
- Fixed signal code for FreeBSD/x86 and Mac OS X.
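For those curious about the preload scheme mentioned in the ChangeLog, here is a minimal sketch of the idea in plain C. It is not the Glucas code: the function name butterfly_preload and the simple add/subtract butterfly are only illustrative assumptions. The point is that the operands for cycle i+1 are loaded while cycle i is being computed, which needs enough free registers to hold both sets at once.

```c
#include <stddef.h>

/* Hypothetical kernel, not taken from Glucas: in-place butterflies
   (a[i], b[i]) -> (a[i] + b[i], a[i] - b[i]) with a one-cycle preload. */
void butterfly_preload(double *a, double *b, size_t n)
{
    if (n == 0)
        return;

    /* Prologue: load the operands of the first cycle. */
    double a_cur = a[0];
    double b_cur = b[0];

    for (size_t i = 0; i + 1 < n; i++) {
        /* Load what the NEXT cycle needs... */
        double a_next = a[i + 1];
        double b_next = b[i + 1];

        /* ...while computing on what the previous cycle loaded. */
        a[i] = a_cur + b_cur;
        b[i] = a_cur - b_cur;

        a_cur = a_next;
        b_cur = b_next;
    }

    /* Epilogue: the last cycle has nothing left to preload. */
    a[n - 1] = a_cur + b_cur;
    b[n - 1] = a_cur - b_cur;
}
```

Each cycle keeps two sets of operands live (current and next), so a real FFT pass with many values per butterfly would probably only benefit on a processor with a large register file.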
This release was made possible by the help of Klaus Kastens (a new Glucas
developer) and the superb team of beta testers (in alphabetical
order): Brian J. Beesley, Tom Cage, Ludovic Ferrandis, Gregory Matus and
Thomas Perrier.
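To illustrate the timing changes listed in the ChangeLog (millisecond elapsed time and the percentage of processor time), here is a minimal sketch using the POSIX gettimeofday() and the standard C clock(). It is an assumption about how such figures can be obtained, not Klaus's actual routines; the helper names elapsed_ms and cpu_percent are hypothetical.

```c
#include <sys/time.h>   /* struct timeval, gettimeofday() (POSIX) */
#include <time.h>       /* clock(), clock_t, CLOCKS_PER_SEC */

/* Elapsed real time between two gettimeofday() samples, in milliseconds. */
double elapsed_ms(const struct timeval *t0, const struct timeval *t1)
{
    return (double)(t1->tv_sec - t0->tv_sec) * 1000.0
         + (double)(t1->tv_usec - t0->tv_usec) / 1000.0;
}

/* Percentage of the elapsed real time that the process spent on the CPU,
   computed from two clock() samples. Returns 0 if no time has elapsed. */
double cpu_percent(clock_t c0, clock_t c1, double elapsed_millis)
{
    if (elapsed_millis <= 0.0)
        return 0.0;

    double cpu_ms = (double)(c1 - c0) * 1000.0 / CLOCKS_PER_SEC;
    return 100.0 * cpu_ms / elapsed_millis;
}
```

In use, one would sample gettimeofday(&t0, NULL) and clock() before a block of iterations, sample both again afterwards, and pass the pairs to these two helpers.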
The performance increase is moderate on most platforms (0% to 5%). On
Itanium (the STAR of the release) this version is almost twice as fast as
2.8b. Here are the timings for an Itanium @ 800 MHz (Compaq Blazer Itanium):
*****************************************************************************
These are timings from Glucas-2.8c (sec/iter), roundoff check on/off.
Itanium: Intel C compiler v.5.0 beta
COMPILER FLAGS:
-O3
GLUCAS FLAGS:
-DY_AVAL=3 -DY_MEM_THRESHOLD=32768 -DY_BLOCKSIZE=4096 -DY_SHIFT=9
-DY_TARGET=0 -DY_ITANIUM
(1) Blazer Itanium, 4 @800, RedHat Linux, Kernel 2.4.3
128 K 0.011/0.010
144 K 0.015/0.013
160 K 0.015/0.014
192 K 0.018/0.016
224 K 0.021/0.019
256 K 0.023/0.022
288 K 0.031/0.028
320 K 0.031/0.029
384 K 0.037/0.035
448 K 0.043/0.042
512 K 0.049/0.047
576 K 0.065/0.059
640 K 0.082/0.080
768 K 0.100/0.095
896 K 0.117/0.114
1024 K 0.134/0.130
1152 K 0.166/0.155
1280 K 0.187/0.181
1536 K 0.223/0.215
1792 K 0.260/0.253
2048 K 0.295/0.288
2304 K 0.363/0.341
2560 K 0.378/0.367
3072 K 0.447/0.431
3584 K 0.523/0.508
4096 K 0.588/0.574
********************************************************
On the server there are two prebuilt IA64 binaries. I would recommend the
statically linked one. It was built with an older beta compiler version and is
faster (about 5%).
There are a lot of good, fast binaries built for the PowerPC family
(Linux/MacOS/MacOSX/Darwin). The timings for G4 processors are remarkably
good (only 10% slower than prime95/pentiumIII).
We still have not made a good timing page; we will send it to E. Mayer and to
SourceForge when possible.
Have a good weekend.
Guillermo.
--
Guillermo Ballester Valor
Granada (Spain)