To anyone running Mlucas on one of the new SunBlade
workstations:

Bill Rea writes:
<<
I've had a call from Sun to say that SunBlades manufactured before
March 14 have a "bug" in the prefetch pipeline which, under very
rare circumstances, can give incorrect floating point results.
They have produced a "patch" which turns off the prefetch pipeline.
I've switched over to using the non-prefetch Mlucas version. The notice
from Sun did not include a non-disclosure clause so I assume they
are willing for this information to be made known publicly. Certainly
any gimpsters should switch to the non-prefetch Mlucas until the
patch is installed on their systems. After that they should continue
to use the non-prefetch version as the prefetching is disabled.

I've reported a couple of results and have four more which could
potentially be affected by the bug. Maybe George would like to
schedule early double checks on these as if we wait for the double 
checking to reach them that could be a few years away. 
>>

Note that both a prefetch-enabled and a non-prefetch
binary are included in the Mlucas 2.7b tarball, so
you simply need to switch to the non-prefetch binary
which you already have. You'll probably see a bit of a
performance hit, which is quite annoying.

Bill, could you send George the exponents you ran (at
least partway) using prefetch on your SunBlade, so he
can schedule double-checks ASAP? If we get statistics
on a decent number of them, perhaps we can better
decide whether the risk of running into the prefetch
bug is worth the increased throughput of running with
prefetch enabled.

On an unrelated issue, Bill writes:

<<
Some of these are giving roundoff errors as we approach the limit
of what 640K FFT will handle, I hope they go to completion.
Here's one set of roundoff warnings so far:-

M12542347 Roundoff warning on iteration  188766 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration  288378 maxerr =  0.437500000000
M12542347 Roundoff warning on iteration  613044 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration  722136 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration 1106225 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration 1111869 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration 1190077 maxerr =  0.406250000000
>>

Errors of 0.40625 should be fine, assuming you're not
getting oodles of them. 0.4375 is a bit more worrisome,
but if you only get a few you should be OK, at least
based on my past experience with the program. For you to
get an error of 0.4375 which is really 0.5625 aliased to
0.4375 by the NINT function, you'd also expect to see
lots of intermediate errors of 0.500, assuming errors
are distributed reasonably smoothly (which is of course
also questionable in the discrete world). If a 0.500
error is detected, the program will halt that run and
proceed to the next exponent in the worktodo.ini file,
but your savefiles from the aborted run will still be
OK. In that event, you should (at your earliest
convenience) halt the program, restore the exponent of
the aborted run to the top of the .ini file, and switch
to a different radix set index in the mlucas.cfg file
in an attempt to get past the troublesome iteration.
(If that succeeds, you can then switch the .cfg file 
back to the earlier radix set, which presumably was 
giving the best runtimes on your machine.)
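
To make the aliasing argument concrete, here is a rough Python
sketch (not Mlucas code; the function names and the stand-in
value 100 are made up for illustration). A NINT-style check only
sees the distance to the nearest integer, so a true error e
between 0.5 and 1 shows up as 1 - e. The sketch also tallies the
warnings quoted above, using the log format exactly as it appears
in Bill's message.

```python
import re
from collections import Counter

# Pattern for the warning lines quoted in the message above.
WARNING = re.compile(
    r"^M(\d+) Roundoff warning on iteration\s+(\d+) maxerr =\s+([0-9.]+)"
)


def observed_error(true_error):
    """Distance to the nearest integer: all a NINT-based check can see."""
    x = 100.0 + true_error  # 100 is an arbitrary stand-in for the exact integer
    return abs(x - round(x))


def tally(lines):
    """Count roundoff warnings per reported maxerr value."""
    counts = Counter()
    for line in lines:
        m = WARNING.match(line.strip())
        if m:
            counts[float(m.group(3))] += 1
    return counts


LOG = """\
M12542347 Roundoff warning on iteration  188766 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration  288378 maxerr =  0.437500000000
M12542347 Roundoff warning on iteration  613044 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration  722136 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration 1106225 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration 1111869 maxerr =  0.406250000000
M12542347 Roundoff warning on iteration 1190077 maxerr =  0.406250000000
"""

if __name__ == "__main__":
    for err, n in sorted(tally(LOG.splitlines()).items()):
        print(f"maxerr {err:.5f}: {n} warning(s); "
              f"if aliased, the true error would be {1 - err:.5f}")
```

A true error of 0.5625 and a true error of 0.4375 give the same
observed value, which is why the count of warnings near 0.5
matters more than any single reading.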

It's a bit of a manual kludge, but hopefully won't
occur very often. (And if the kludge seems to be an
effective one, I'll probably have the code do it
automatically in the next release.)

-Ernst

_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
