On Saturday 25 May 2002 22:19, you wrote:
> I noticed that v22.2 and v22.3 automatically do roundoff checking every
> iteration for any exponent close enough to the FFT limit. Is there any
> reason to be concerned about the possibility of roundoff error for CPUs
> that aren't P4s?

I don't think so. We are looking at the x87 (non-SSE2) code and may make some minor adjustments to the FFT run length crossover points, but there is a lot of "experimental evidence" relating to non-SSE2 code; the adjustments are probably as likely to be up as down.

Please remember that the crossover points are a compromise between wasting time by using an excessive FFT run length and wasting time due to runs failing (or needing extra checking) due to using an FFT run length which is really too short. There is no completely safe figure.

> What about if the non-P4s are only doing double checks?

This doesn't really matter. Double checks are independent of the first test. Don't assume that the first test was correct... if you make that assumption, what's the point in running a double-check at all?

> Since numbers of double checking size have been checked by non-P4s for
> years without any problems that I've heard about.

The point is, if you do get an excess roundoff error that makes the run go bad, the double-check (when it is eventually done) will fail, and the exponent will have to be tested again. There is essentially no possibility of the project missing a prime as a consequence of this.

However, if you can detect the likelihood of there being excess roundoff errors at the time they're occurring, you can save time which would be wasted if you continue a run which has already gone wrong. This also virtually eliminates the possibility of you, personally, missing a prime due to a crossover being too aggressive and therefore falling victim to an undetected excess roundoff error.

We simply don't know if there are extra problems occurring very close to the existing non-SSE2 crossover points, as any "genuine" errors caused by the crossover points being too aggressive are overwhelmed by errors caused by "random" hardware/software glitches. However, it has become apparent that the SSE2 crossover points were initially set too aggressively.

We do have one documented instance where a roundoff error of 0.59375 occurred (aliased to 0.40625, therefore causing a run to go bad) without there being any other instances of roundoff errors between 0.40625 and 0.5. This is probably a very, very rare event, but the fact that it has happened at all has made us more wary.

v22.3 has a new error checking method which will _correct_ any run which is going wrong by running the iteration where the excess roundoff error occurs in a slow but safe mode. This of course depends on the excess roundoff error being detected. If you have roundoff error checking disabled then you miss the chance 127 times out of 128.

The roundoff error rises very rapidly with the exponent size - somewhere round about the 25th power. This is why it's only worthwhile having roundoff error checking every iteration in the top 0.5% or so of the exponent range for any particular run length - that 0.5% makes a lot more than 10% difference to the expected maximum roundoff error.

Why not just set the crossovers lower? Well, this would work, but running with roundoff checking enabled is faster than running with the next bigger FFT run length but without roundoff checking.

Another consequence of having roundoff error checking enabled is that random hardware glitches (or software glitches due to misbehaviour by device drivers etc. unrelated to Prime95) will be detected much more consistently.

> Very specifically, I'm
> wondering if I should be ok if I use the "undocumented" setting in
> prime.ini to turn off roundoff checking every iteration for when my Pentium
> 200 MHz double checks 6502049 ( the next FFT size is at 6520000 ). Thanks.

Up to you. My feeling is that the new default behaviour is right. However, per-iteration roundoff checking probably causes more of a performance hit on the Pentium architecture than on PPro or P4 due to the relative shortage of registers.

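For anyone who wants to see the arithmetic behind the roundoff figures quoted earlier, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not Prime95's code: the function names are hypothetical, the integer 1000 is an arbitrary stand-in for the exact result, and the exponent of 25 is just the rough "25th power" estimate given above.

```python
def observed_roundoff(true_error):
    """Distance from (exact integer + true_error) to the nearest integer.

    A roundoff check can only see how far a computed value lies from the
    nearest integer, so a true error above 0.5 is aliased back below 0.5.
    """
    value = 1000 + true_error  # 1000 is an arbitrary stand-in for the exact result
    return abs(value - round(value))


def growth_factor(fraction, power=25):
    """Relative growth in roundoff error when the exponent rises by `fraction`.

    Assumes the error scales as exponent**power; power=25 is only the
    rough estimate quoted in this message.
    """
    return (1 + fraction) ** power


# A true error of 0.59375 rounds to the wrong integer yet looks like 0.40625.
print(observed_roundoff(0.59375))
# A genuine error of 0.40625 gives exactly the same reading.
print(observed_roundoff(0.40625))
# Growth over the top 0.5% of the exponent range: roughly 1.13.
print(growth_factor(0.005))
```

The first two calls give the same reading, which is why an error above 0.5 can slip through unnoticed. The last call shows that, if the 25th-power estimate holds, the top 0.5% of the range raises the expected error by roughly 13%.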
Another point here, if people using v22.3+ leave the default behaviour, we will get a lot better evidence as to the actual behaviour in the critical region just below the run length crossovers; we will be able to feed this back in the form of revised crossovers and/or auto roundoff error check range limit.

QA work should prevent gross errors, but the amount of data which QA volunteers can process is small compared to the total throughput of the project. We should have avoided the problems with the aggressive SSE2 crossovers, but QA volunteers didn't have P4 systems at the time the code was introduced.

Regards
Brian Beesley
_________________________________________________________________________
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
