On 14 May 2001, at 20:52, George Woltman wrote:
>
> >Is the self-test in fact just to check
> >that there's not something in the CPU which goes glitchy when running
> >flat-out SSE2 code for hours on end?
>
> Yes. The QA suite that Ken Kriesel and Brian Beesley worked on does a
> better job at testing edge conditions. Of course, they'll need to update that
> suite using the new limits.
>
Presumably that means FPU code on processors other than the P4. But
yes, the self-test is, and (to the best of my knowledge) always has
been, a hardware test rather than a test of the software, at any rate
once the software is in the hands of users.

Actually, most of the exponents in the test suite were chosen to
exercise the code in a manner which is particularly hard on the
"magic numbers" involved in the collapsed DWT. I'm sure we can find
some more exponents & extend the test suite so that more candidates
are available in the appropriate range for each FFT run length. This
was probably overdue anyway, given the increasing performance of
available processors.

Perhaps it would help us if George could indicate the approximate
limits for each run length when the SSE2 code is in use.

But could I just point out that there is a _potential_ benefit in
running self-tests using ridiculously small exponents for the run
length being tested? Normally the maximum permitted roundoff error is
0.4; if the apparent error after a bad rounding is spread evenly
between 0 and 0.5, this means that a roundoff error will only be
detected as such on one in every five occasions on which it occurs.
If we use a small exponent then we could reduce the roundoff error
limit to 0.1 (or maybe even less) and therefore detect a much larger
proportion (four in five at a limit of 0.1) of any roundoff errors
which might occur. The fact that the residue checked at the end of
each self-test may still be correct does not prove that a hardware
glitch has not occurred, though gross errors will of course cause the
self-test to fail because the final residue comes out wrong.
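
The one-in-five figure can be reproduced with a very small model. The
sketch below is illustrative only: the function name is made up, and it
assumes that when a value is rounded to the wrong integer, the apparent
fractional error is uniformly distributed between 0 and 0.5. Real
error distributions may well differ.

```python
from fractions import Fraction


def detection_fraction(limit):
    """Fraction of wrong roundings that exceed the roundoff limit.

    Assumption (not measured): after a wrong rounding, the apparent
    error is uniform on [0, 0.5), so only the part of that interval
    above `limit` is flagged.
    """
    half = Fraction(1, 2)
    limit = Fraction(limit)
    if not 0 <= limit <= half:
        raise ValueError("limit must lie between 0 and 0.5")
    return (half - limit) / half


if __name__ == "__main__":
    for limit in (Fraction(4, 10), Fraction(1, 10)):
        share = detection_fraction(limit)
        print(f"limit {float(limit):.1f}: detects {share} of wrong roundings")
```

Under that assumption a limit of 0.4 catches 1/5 of the wrong
roundings and a limit of 0.1 catches 4/5.
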
If this idea is developed, it's important to be aware that using too
small an exponent for a particular run length can invalidate the
"subtract two" coding in the LL test. I think the lower limit is
three bits per element, though this may depend to some extent on the
exact way in which the code is implemented.
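
For a rough idea of where that lower limit falls, the sketch below
turns "three bits per element" into a smallest exponent for a given
run length. It assumes that bits per element means the exponent
divided by the run length in elements, and it takes the figure of
three as the tentative value given above, not a verified constant. The
function names are hypothetical, and no check is made that the
resulting exponent is prime.

```python
import math

# Tentative lower limit from the text above; treat it as an assumption.
MIN_BITS_PER_ELEMENT = 3


def bits_per_element(exponent, run_length):
    """Average number of bits stored in each FFT element."""
    return exponent / run_length


def smallest_exponent_bound(run_length, min_bits=MIN_BITS_PER_ELEMENT):
    """Smallest exponent that still averages `min_bits` bits per element."""
    return math.ceil(min_bits * run_length)


if __name__ == "__main__":
    for run_length in (128 * 1024, 256 * 1024, 512 * 1024):
        bound = smallest_exponent_bound(run_length)
        print(f"run length {run_length}: exponent should be at least {bound}")
```

Any small self-test exponent would then be picked at or above this
bound for the run length in question.
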
Regards
Brian Beesley