Re: Mersenne: Re: Roundoff errors
----- Original Message -----
From: "George Woltman" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, July 24, 2002 3:01 AM
Subject: Mersenne: Re: Roundoff errors

> Daran mentioned that he was only doing P-1 factoring,
> leaving the LL test for others.

That was correct. I have just started an LL test on your advice, with roundoff checking turned on. If it completes, I will return to doing P-1 only. I'm satisfied that there is no evidence of any problem not attributable to overheating.

Regards,

A very cool-running Daran G.
Mersenne: Re: Roundoff errors
At 06:49 PM 7/23/2002 -0400, [EMAIL PROTECTED] wrote:

> >The M12898349 error may be OK as it may have been very
> >close to the limit of the 640K FFT (I don't recall the
> >22.3 crossover points).
>
> According to the GIMPS status page, the 640K crossover
> point for the more-accurate x86 client (i.e. the non-
> SSE2 code) is 12.83M,

Thanks to Brian's research, we've revisited all the x87 crossovers. Some v22 executables had a 12.90M crossover. This has been lowered to 12.89M in v22.7.

> Except that P-1 stage 1 (unless I'm mistaken about
> Prime95's particular implementation thereof) uses the
> same FFT-based squaring code that the LL test does.
> ...it seems curious that Daran mentions seeing the roundoff
> warnings in P-1 stage 1, and not in the subsequent LL
> tests of these exponents. George, is error checking done
> more frequently in P-1 than in LL?

Daran mentioned that he was only doing P-1 factoring, leaving the LL test for others. The P-1 factoring code does error checking just as often as the LL code.
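For readers following the error-checking details: after each FFT-based squaring, every convolution output should lie very close to an integer, and the largest distance to the nearest integer is the reported roundoff error. Below is a minimal, self-contained C sketch of such a check. The function name and sample values are hypothetical; this is not Prime95's actual code, which runs the same squaring-plus-check path in both the LL and P-1 stage 1 loops, per George's remark above.

#include <math.h>
#include <stdio.h>

/* Scan the convolution outputs and return the largest fractional error,
 * i.e. the largest distance from a value to its nearest integer. */
double max_fractional_error(const double *conv, int n)
{
    double maxerr = 0.0;
    for (int i = 0; i < n; i++) {
        double frac = fabs(conv[i] - rint(conv[i]));  /* in [0, 0.5] */
        if (frac > maxerr)
            maxerr = frac;
    }
    return maxerr;
}

int main(void)
{
    /* Hypothetical convolution outputs; each should be near an integer. */
    double conv[4] = { 12345.0001, -678.4375, 42.25, 7.0 };
    double err = max_fractional_error(conv, 4);
    if (err > 0.4)
        printf("Possible roundoff error (%.4f), backtracking to last save file.\n", err);
    return 0;
}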
Mersenne: Re: Roundoff errors
Daran wrote:

> Twice, in the past three days I've had the message:
> "Possible roundoff error (0.4xxx), backtracking to
> last save file.", while performing P-1 stage one on
> M13194707 and M12898349 respectively.

George Woltman replied:

> The M12898349 error may be OK as it may have been very
> close to the limit of the 640K FFT (I don't recall the
> 22.3 crossover points).

According to the GIMPS status page, the 640K crossover point for the more-accurate x86 client (i.e. the non-SSE2 code) is 12.83M, which is already less than the smaller of Daran's exponents - the SSE2 crossovers (irrespective of the precise code version) are going to be smaller still. Thus, neither of these exponents should be yielding close-to-crossover roundoff errors, since both should be running in the lower part of the 768K-FFT-length exponent range.

> Errors during P-1 are not particularly critical.

Except that P-1 stage 1 (unless I'm mistaken about Prime95's particular implementation thereof) uses the same FFT-based squaring code that the LL test does. If there's a hardware problem or (say) a device driver that is corrupting data, it would seem just as likely to do so in the P-1 step as in the LL test.

Still, it seems curious that Daran mentions seeing the roundoff warnings in P-1 stage 1 and not in the subsequent LL tests of these exponents. George, is error checking done more frequently in P-1 than in LL?

Daran, were you doing P-1 ONLY on these exponents, or were these the first Prime95 runs you did on some new machines? (In which case the code perhaps hadn't gotten to the LL stage yet.)

-Ernst
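To put rough numbers on the crossover argument: in an n-word FFT of M(p), each double carries p/n bits on average, and it is this bit density that drives the roundoff behavior. A quick back-of-envelope calculation in C (the 12.83M crossover is the figure quoted above; the bits-per-word ceiling it implies is illustrative, not an official Prime95 constant):

#include <stdio.h>

int main(void)
{
    /* Exponents from the thread; FFT lengths 640K and 768K, in words. */
    double p[3] = { 12830000.0,    /* quoted 640K x86 crossover        */
                    12898349.0,    /* smaller of Daran's exponents     */
                    13194707.0 };  /* larger of Daran's exponents      */
    double n640 = 640.0 * 1024.0;
    double n768 = 768.0 * 1024.0;

    for (int i = 0; i < 3; i++)
        printf("p = %8.0f: %5.2f bits/word at 640K, %5.2f bits/word at 768K\n",
               p[i], p[i] / n640, p[i] / n768);

    /* The crossover corresponds to ~19.6 bits/word at 640K; both of
       Daran's exponents exceed it, so both should be running at 768K,
       at a comfortable ~16.4-16.8 bits/word. */
    return 0;
}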
Mersenne: Re: Roundoff errors
I wrote:

> Especially for large runlengths (and after the first few hundred iterations
> or so), rounding errors tend to be randomly distributed in an approximately
> Gaussian fashion,

Brian Beesley wrote:

> I know perfectly well what you mean, but these two statements tend to
> contradict each other. Gaussian distributions are continuous &
> smooth; we have instead a discrete distribution whose gaps tend to
> increase with size.

I added some code to the Mlucas 2.7a source (my personal copy, not the version on my website) to better quantify this. Since fractional errors are defined in the interval [0, 0.5], I subdivided this interval into 64 equal-sized bins and counted how many fractional parts fell into each bin on each iteration. Then I ran around a hundred iterations of M1325011 using an FFT length of 64K, i.e. an exponent close to the practical roundoff limit for that FFT length on an IEEE64-compliant machine.

Up to iteration 13, all fractional errors fell into bin 0, i.e. were in [0, 1/128). After that, things looked as follows. In the table below, the rows correspond to bins 0 through 64 (bin 64 corresponds to the special case of an error = 0.5), the columns to iterations, and each entry gives the number of fractional errors falling into that bin on that iteration. (Each column sums to 65536, the number of residue words at FFT length 64K.)

                               iteration
 bin      14     15     16     17     18     19     20     21  ...     44
 ---   -----  -----  -----  -----  -----  -----  -----  -----      -----
   0   65530  65457  64930  61600  44951  23245   9680   7781       7555
   1       6     76    555   3332  15541  19547  10589   7371       7158
   2       0      3     44    475   3573  11771  12090  10433      10356
   3       0      0      7     96    972   5620   7639   5971       5961
   4       0      0      0     30    348   3089   8518   9169       9353
   5       0      0      0      3     84   1065   4321   4033       4042
   6       0      0      0      0     55    740   4751   5804       5841
   7       0      0      0      0      5    164   1923   2462       2489
   8       0      0      0      0      6    199   2708   4603       4730
   9       0      0      0      0      0     32    743   1262       1290
  10       0      0      0      0      1     48   1149   2203       2185
  11       0      0      0      0      0      2    266    594        616
  12       0      0      0      0      0     10    672   1803       1890
  13       0      0      0      0      0      0     62    277        259
  14       0      0      0      0      0      3    206    591        606
  15       0      0      0      0      0      0     21    110        118
  16       0      0      0      0      0      1    137    614        636
  17       0      0      0      0      0      0      3     43         49
  18       0      0      0      0      0      0     23    146        121
  19       0      0      0      0      0      0      0     19         22
  20       0      0      0      0      0      0     29    164        162
  21       0      0      0      0      0      0      0      5          6
  22       0      0      0      0      0      0      2     14         20
  23       0      0      0      0      0      0      0      1          2
  24       0      0      0      0      0      0      4     49         51
  25       0      0      0      0      0      0      0      2          0
  26       0      0      0      0      0      0      0      4          3
  27       0      0      0      0      0      0      0      1          0
  28       0      0      0      0      0      0      0      6         11
  29       0      0      0      0      0      0      0      0          0
  30       0      0      0      0      0      0      0      0          0
  31       0      0      0      0      0      0      0      0          0
  32       0      0      0      0      0      0      0      1          4
 33-44     0      0      0      0      0      0      0      0          0
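For anyone wanting to reproduce this kind of instrumentation, here is a minimal, self-contained C sketch of the binning scheme described above: fractional errors lie in [0, 0.5], split into 64 equal bins of width 1/128, plus a 65th bin for an error of exactly 0.5. This is a hypothetical stand-alone version, not the actual diff against the Mlucas 2.7a source:

#include <math.h>
#include <stdio.h>

#define NBINS 64

/* Count how many fractional errors fall into each of the 64 bins over
 * [0, 0.5); bins[NBINS] holds the fatal special case of an error = 0.5. */
void bin_fractional_errors(const double *conv, int n, long bins[NBINS + 1])
{
    for (int i = 0; i <= NBINS; i++) bins[i] = 0;
    for (int i = 0; i < n; i++) {
        double frac = fabs(conv[i] - rint(conv[i]));      /* in [0, 0.5]  */
        if (frac == 0.5) bins[NBINS]++;                   /* exactly 0.5  */
        else             bins[(int)(frac * 2 * NBINS)]++; /* width 1/128  */
    }
}

int main(void)
{
    /* Toy convolution outputs; a real run would bin all 65536 words
       of the 64K FFT on every iteration. */
    double conv[5] = { 10.0, 3.004, -7.4375, 2.251, 0.5 };
    long bins[NBINS + 1];
    bin_fractional_errors(conv, 5, bins);
    for (int i = 0; i <= NBINS; i++)
        if (bins[i]) printf("bin %2d: %ld\n", i, bins[i]);
    return 0;
}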
Re: Mersenne: Re: Roundoff errors
On 7 May 00, at 17:41, [EMAIL PROTECTED] wrote:

> A large significant (non-error)
> part of the convolution coefficient means that any accumulated rounding
> errors will collect in the least-significant few bits of the floating-point
> mantissa. That's why errors close to 0.5 tend to come in the form of
> (integer)/(small power of 2).
>
> b) Especially for large runlengths (and after the first few hundred iterations
> or so), rounding errors tend to be randomly distributed in an approximately
> Gaussian fashion,

I know perfectly well what you mean, but these two statements tend to contradict each other. Gaussian distributions are continuous & smooth; we have instead a discrete distribution whose gaps tend to increase with size.

If we have a mechanism which tends to "chop down" the result of a floating-point operation towards zero when the FPU register isn't accurate enough to contain it all, and we're down to 4 guard bits, then an actual rounding error of 0.49 could be represented as 0.4375, i.e. there is almost no safety margin at all between 0.4375 and 0.5. If we _ever_ see a value between 0.4375 and 0.5, we must have more than 4 guard bits, and the logic of your argument is that we are probably OK if we see 0.4375's, but not 0.46875's. We should probably find out how much safety we have - i.e. can we provoke a 0.46875, or even a 0.484375?

As for fixing the problem (when we are in a position to make a rational analysis of what constitutes a problem!), there would seem to be an automatic recovery method, as follows (see the sketch after this message):

1) Unpack the work vector for the previous iteration & recover the true residual.
2) Generate a new work vector for the next FFT run length & pack the residual into it.
3) Run one iteration - i.e. re-run the iteration that caused the excess roundoff panic.
4) Unpack the work vector & recover the true residual after running this single iteration.
5) Generate a new work vector for the original FFT run length & pack the residual into it.
6) Continue as if nothing had happened.

This will keep the speed high except for a few isolated iterations. The argument here is that running the whole test with a larger FFT run length is computationally expensive; it's very likely to be unnecessary (unless the FFT run length breakpoint is grossly incorrect - in which case a run will contain a great many "recoveries"); finally, even if an excess rounding error does slip through & cause an incorrect result, double-checking with a different program will pick it up.

Prime95 is a bit different: the extra 11 bits of mantissa in the Intel FPU registers cause the roundoff errors to be apparently much more smoothly distributed, so, even if we do see the odd 0.45..., we _might_ still be OK. Prime95 working in its default mode only checks convolution error every 128 iterations, so there's _some_ chance that things do, in fact, slip through. The safeguard here is that repeating the run with a different offset (as in a double-check run) will give a different distribution of roundoff errors, and (if the calculation went wrong as a result of an excess rounding error) it will not go wrong at the same place again - so the final residual will be different.

Regards
Brian Beesley
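Here is a toy, self-contained C sketch of the recovery procedure outlined in steps 1) through 6) above. The work vector, residual, pack/unpack, and squaring step are all stand-ins (the real FFT machinery is far more involved, and this is not code from Mlucas or Prime95); only the control flow mirrors the proposal:

#include <stdbool.h>
#include <stdio.h>

typedef struct { double residual; int fft_len; } workvec;

/* Steps 1/4 and 2/5: toy unpack/pack of a residual into a work vector. */
static double unpack(const workvec *w) { return w->residual; }
static workvec pack(double r, int fft_len)
{
    workvec w = { r, fft_len };
    return w;
}

/* One squaring iteration; returns false on an excess roundoff error.
 * Here we simply pretend the larger run length never errors. */
static bool square_step(workvec *w)
{
    w->residual = w->residual * w->residual;  /* stand-in for x^2 mod M(p) */
    return w->fft_len >= 768;                 /* fake error criterion      */
}

static void iterate_with_recovery(workvec *w, int cur_len, int next_len)
{
    workvec saved = *w;                /* previous-iteration state          */
    if (square_step(w))
        return;                        /* normal case: nothing to do        */

    /* Roundoff panic: redo just this iteration at the larger run length. */
    double r = unpack(&saved);         /* 1) recover true residual          */
    workvec big = pack(r, next_len);   /* 2) repack at next FFT run length  */
    square_step(&big);                 /* 3) re-run the offending iteration */
    double r2 = unpack(&big);          /* 4) recover residual afterwards    */
    *w = pack(r2, cur_len);            /* 5) repack at original run length  */
}                                      /* 6) continue as if nothing happened */

int main(void)
{
    workvec w = pack(3.0, 640);
    iterate_with_recovery(&w, 640, 768);
    printf("residual now %.0f at run length %dK\n", w.residual, w.fft_len);
    return 0;
}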
Mersenne: Re: Roundoff errors
Richard Otter wrote:

{snip}

> M10068757 Roundoff warning on iteration 9803267 maxerr = 0.4375
> M10068757 is not prime. Res64: AEED24F91193EE6F. Program: E2.7z

Your result is in all likelihood fine - I've done double-checks that gave lots more roundoff errors of the 0.4375 variety and still gave the correct result. Yes, 0.4375 is quite close to the fatal 0.5 level, but the reason it's highly unlikely one of those 0.4375's was really a 0.5625 which was aliased (by the error detection algorithm) to a 0.4375 is as follows:

a) Large errors like the above are the result of large convolution coefficients during the FFT-based squaring. A large significant (non-error) part of the convolution coefficient means that any accumulated rounding errors will collect in the least-significant few bits of the floating-point mantissa. That's why errors close to 0.5 tend to come in the form of (integer)/(small power of 2).

b) Especially for large runlengths (and after the first few hundred iterations or so), rounding errors tend to be randomly distributed in an approximately Gaussian fashion, and the distribution is essentially identical from one iteration to the next. That means that if one of those 0.4375's was really a 0.5625 aliased to a 0.4375, you'd very likely have seen a 0.5000 (which would have caused the test of that exponent to be halted) at some point. Statistically, for a run like yours where errors above 0.4 are few, an error of 0.5625 is going to be rather rarer than a 0.5, i.e. you'd be quite unlikely to get an error of 0.5625 (which would evade detection) but not see any 0.5000's (which would not). (A small numeric demonstration of this aliasing appears after this message.)

> I am running Mlucas_2.7z.ev4 on a Compaq Alpha machine.

For you to see any roundoff errors for p ~ 10.07M seems unusual (the other such cases I've seen are all at 10.08M or above), so I'm guessing that you're running under DEC Unix 4.0. In 4.0, there's a bug in the real*16 trig library which Mlucas uses for sincos tables - for angular arguments of the form theta = n*pi/2 +- pi/512, the sin and cos values can be incorrect. I first encountered this 1 1/2 years ago and built in a workaround for it - the code compares the real*16 and real*8 values and, if the difference exceeds an error threshold, uses the latter. This costs a small amount of accuracy, but when you're really close to the default exponent limit, that's all it takes.

Also, I suggest you upgrade to v2.7a at your earliest convenience; this will allow the program to continue on to the next exponent in the worktodo.ini file if the current run quits due to a fatal error (see below). 2.7a also supports runlengths of 288K and 576K (whereas 2.7z jumps to 320K and 640K, respectively), which means less of a timing jump when your exponents cross the 5.15M and 10.11M thresholds.

* * * *

WHAT TO DO IF YOU GET A FATAL ERROR MESSAGE:

Several users have reported encountering fatal convolution errors while testing exponents close to the default limits for FFT lengths 256K and 512K. If this happens using v2.7z, the program will simply quit. Using v2.7a, the program will simply halt the test of the current exponent and move on to the next one in the worktodo.ini file. In either case, the previous-checkpoint savefiles for the halted exponent will still be there, and I have made available a higher-accuracy (and slightly slower) version of the 2.7a code which should enable you to safely finish these runs.
Check out ftp://209.133.33.182/pub/mayer/README.html

Sorry about any inconvenience this may cause - as more people use the program, I'll be able to refine the runlength/exponent breakpoints based on observed error behavior.

Cheers,
-Ernst
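A small, self-contained C demonstration of the aliasing effect Ernst describes in points a) and b) above: a true error of 0.5625 on a convolution output is reported by a nearest-integer error check as 0.4375, while the output silently rounds to the wrong integer. The values are toy numbers, not real run data:

#include <math.h>
#include <stdio.h>

int main(void)
{
    double true_value = 1000.0;               /* the exact integer output */
    double computed   = true_value + 0.5625;  /* true error = 0.5625      */

    /* The check can only see the distance to the NEAREST integer. */
    double reported = fabs(computed - rint(computed));

    printf("true error   = %.4f\n", fabs(computed - true_value)); /* 0.5625 */
    printf("reported err = %.4f\n", reported);                    /* 0.4375 */
    printf("rounds to    = %.0f (off by one)\n", rint(computed)); /* 1001   */
    return 0;
}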