> >Yeah, but most of those are "silent mutations" in nonzero residues, not
> >errors in the purported primality result, right?
> 
> Right.  The odds heavily favor both mismatched residues being nonzero.
> A zero residue at the last iteration is what indicates primality.

Depends on what you mean. If a Mersenne number that has been 
tested once really is prime, but the test "went wrong", then we 
have a wrong result, i.e. the number is reported as composite 
when it is in fact prime.
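To illustrate (this sketch is not part of the original post): the Lucas-Lehmer test reports M_p = 2^p - 1 as prime exactly when the final residue is zero, so a single corrupted value partway through a run can turn a genuine prime into an apparent composite. The function name and the `flip_after` / `flip_bit` parameters below are hypothetical, added only to simulate a hardware error; real clients such as Prime95 use FFT-based multiplication rather than plain big-integer arithmetic.

```python
def lucas_lehmer_residue(p, flip_after=None, flip_bit=0):
    """Return the final Lucas-Lehmer residue for M_p = 2**p - 1 (p an odd prime).

    flip_after and flip_bit are hypothetical knobs for this sketch: they
    flip one bit of the running value after the given iteration, to mimic
    a memory or CPU error during the run.
    """
    m = (1 << p) - 1
    s = 4
    for i in range(1, p - 1):          # p - 2 squaring steps in total
        s = (s * s - 2) % m
        if i == flip_after:
            s ^= 1 << flip_bit         # simulated single-bit error
    return s


# M_7 = 127 is prime, so a clean run ends with residue 0.
print(lucas_lehmer_residue(7))                 # 0
# The same run with one bit flipped after iteration 3 ends nonzero,
# so the prime would wrongly be reported as composite.
print(lucas_lehmer_residue(7, flip_after=3))   # 60
```

Because every later iteration squares the corrupted value, the error is never "forgotten"; the final residue is wrong even though the run otherwise completes normally.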

This is why double-checking is so important, if we want to find *all* 
the Mersenne primes for exponents up to a given limit.

There *might* be one or two lurking somewhere in the mass of 
exponents which have only been tested once.
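As a rough sketch of the double-checking idea (again not from the original post, and not the actual GIMPS server logic): two independent runs of the same exponent are compared by their final residues, and only a matching pair is treated as settled. The function name and the result strings are assumptions made for illustration.

```python
def compare_runs(residue_a, residue_b):
    """Classify an exponent from the final residues of two independent runs.

    Hypothetical helper for illustration: matching residues confirm the
    result; any mismatch means at least one run went wrong and a further
    test is needed to find out which.
    """
    if residue_a != residue_b:
        return "mismatch: needs a further test"
    if residue_a == 0:
        return "verified prime"
    return "verified composite"


print(compare_runs(0, 0))     # verified prime
print(compare_runs(60, 60))   # verified composite
print(compare_runs(0, 60))    # mismatch: needs a further test
```

An exponent that has only one run on record has nothing to be compared against, which is why a missed prime could still be hiding among the once-tested exponents.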
> 
> >Also what causes the errors, bugs in the code? 
> 
> What I've seen most often is that prime95 and its relatives provide
> early warning of unreliable hardware, whether cpu, RAM module, or motherboard.

Usually caused by overheating - failed CPU fan, poor ventilation or 
excessively hot environment, overclocking, poor thermal contact 
between processor substrate and heatsink ... 
> 
> >Is work being done on
> >finding subtle errors in the software? 
> 
[... snip ...]
> I've volunteered to run on Intel, one or two exponents in each run length
> to try out the code before the bulk of the GIMPS effort is routinely being
> assigned exponents in the higher run lengths.  Perhaps someone running a
> different architecture would be willing to double check them.

Ken, could you tell me which exponents you've run and which still 
need checking? I'll double-check them using MacLucasUNIX on a 
533 MHz Alpha system (approx. same speed as a PII-300 running 
Prime95). I would have thought one exponent per run length would 
be enough, provided our results agree. My Alpha system has ECC 
memory etc., so it should be reasonably reliable.

[... snip ...]
> >Are some of them, in the case of
> >Prime95, caused by Winblows? 
> 
> Iteration: 1407235/5070277, ERROR: ILLEGAL SUMOUT
> Possible hardware failure, consult the readme file.
> Continuing from last save file.
> 
> Possibly these are software, according to George's readme.txt, under the 
> section Possible Hardware Failure
> If it is software, it is not necessarily the fault of Windows or Microsoft.
> Could be a bum driver not doing things it should.

There's a distinct *possibility* that *any* software running under 
Win 9x could directly alter values in Prime95's workspace, since 
Win 9x applications have access to the whole physical address 
space. In Win NT (and Linux), only kernel mode tasks can do this, 
so the likelihood of memory being clobbered by a rogue application 
is a great deal less.
> 
> >(IOW, are there more errors per P90 CPU hour
> >among Winblows boxes than among mprime boxes?)

Don't know. It may be possible to get this info from George's 
database, but it would take a fair bit of digging out.

> >I figure only a small percentage of participants have actual faulty
> >hardware, and that spurious cosmic ray bit flips are caught by checksumming
> >of some kind.

I think you'll find it's surprising how many systems become a great 
deal more reliable if (a) the PSU is adjusted so that the supply rails 
are accurate (+/- 5% errors are common), (b) the cooling is 
improved (even turning down the room thermostat by a couple of 
degrees can make a difference), (c) the system is clocked just a 
few percent slower (especially if it has been overclocked to start 
with).

Lots of users put up with flaky hardware; they get used to Windoze 
locking up once in a while & just blame Bill Gates. He's not 
*always* the guilty party.

Also, in my fairly extensive experience, systems that have been 
well-handled from an electrostatic point of view tend to be reliable, 
whereas those where people have changed memory etc. without 
observing anti-static precautions tend to be flaky.

Regards
Brian Beesley
________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm