Mersenne Digest          Friday, September 17 1999       Volume 01 : Number 627

----------------------------------------------------------------------

Date: Thu, 16 Sep 1999 11:47:48 +1200 (NZST)
From: Bill Rea <[EMAIL PROTECTED]>
Subject: Re: Mersenne: SPARC times

Laurent,

> From [EMAIL PROTECTED] Wed Sep 15 20:37:35 1999
>
> Bill Rea wrote:
> >
> > This is using MacLucasUNIX compiled with the Sun workshop compilers.
>
> Which version and which flags did you use? I guess you ran
> your tests under Solaris 7, right?

The MacLucasUNIX was v6.20. The Ultra 5 and 10 were running Solaris 7; both
E450's were running Solaris 2.6. I tried both the Workshop 4 and Workshop 5
compilers. The options in the makefile which gave the fastest times on the
tests were:

OPT=-fast -xO4 -xtarget=ultra -xarch=v8plusa -xsafe=mem -xdepend \
    -xparallel -xchip=ultra

> That's very strange! I have benchmarked some code using both
> flags (with -fast prepended) under Solaris 7 (which is required
> to run code compiled with -xarch=v9) and v9 helped; however, the
> code was purely 64-bit integer.
>
> There's also a very interesting flag to test that's not
> documented in the Sun cc doc: -xinline=all. I used it by mistake but
> it did a great job with the code I was working on.

I'll try this option and also put -xarch=v9 to the test on a new exponent
when the next one finishes. The implication in the WS4 documentation is that
-xinline=all is automatically used with -xO4, but it's worth trying anyway.

> I don't know the tests supplied but the difference might result
> from the way time is counted. I think the best way to check the
> speed of a code is to use getrusage for the process only
> (RUSAGE_SELF) and to take into account only the user time. This
> way I get very consistent timings for the aforementioned code
> (BTW, the code is ecdl by Robert Harley, used to crack ECC).

The time command reports wall-clock time, user time, and system time. I was
comparing user time.
I talked very briefly to a Sun engineer and he said it probably had more to
do with keeping the processor fed. Sometimes optimizations in the code don't
translate into faster speeds because the limiting factor is memory access
time. The CPUs with bigger caches perform better than you would expect given
their clock speeds. However, he was just as surprised as I was that the v9
option didn't produce faster code.

Bill Rea, Information Technology Services, University of Canterbury  \_
E-Mail b dot rea at its dot canterbury dot ac dot nz                 </  New
Phone 64-3-364-2331, Fax 64-3-364-2332                               /)  Zealand
Unix Systems Administrator                                           (/'
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Wed, 15 Sep 1999 20:13:02 -0500
From: Ken Kriesel <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Iters between screen outputs

At 10:29 PM 1999/09/15 +0200, "Shot" <[EMAIL PROTECTED]> wrote:
>Hi all,
>
>I was wondering why the "Iterations between screen outputs" setting
>defaults to 100, and I thought it was because each screen
>output takes precious CPU time.
>
>But when I changed it from 100 i/o (usually around 0.430 sec/iter) to
>10 i/o, nothing really slowed down - the time was still between 0.430
>and 0.431 s/i.
>
>So I boldly went where no man has gone before ;) and changed it to 1
>i/o... and the times still stayed between 0.429 and 0.431 s/i.
>
>The question is, how much of the CPU's power is consumed by screen
>outputs?

Hardly any these days, with fast Pentiums and smart video cards. But on
486's and 386's it was significant. I have systems where it's set to 5
iterations, and systems where it's set to 1000 iterations. It all depends
on system speed, exponent size, and desired update frequency.

Ken

Ken Kriesel, PE <[EMAIL PROTECTED]>

------------------------------

Date: Wed, 15 Sep 1999 22:23:58 -0400
From: Jeff Woods <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Iters between screen outputs

The actual ITERATION takes the same time, no matter how long the screen
output takes. The figure reported on the screen is NOT the time for the
iteration PLUS the time to display the report... it's the time for the
iteration WITHOUT the screen display.

At 10:29 PM 9/15/99 +0200, you wrote:
>Hi all,
>
>I was wondering why the "Iterations between screen outputs" setting
>defaults to 100, and I thought it was because each screen
>output takes precious CPU time.
>
>But when I changed it from 100 i/o (usually around 0.430 sec/iter) to
>10 i/o, nothing really slowed down - the time was still between 0.430
>and 0.431 s/i.
>
>So I boldly went where no man has gone before ;) and changed it to 1
>i/o... and the times still stayed between 0.429 and 0.431 s/i.
>
>The question is, how much of the CPU's power is consumed by screen
>outputs?
>
>If it is really around 0.001 s/i (in the 0.430 s/i neighbourhood), I'll
>leave mine at 10 i/o, so I can see it's alive. ;)
>
>Thanks for your time,
>-- Shot
> __
> c"? Shot - [EMAIL PROTECTED]  hobbies: Star Wars, Pterry, GIMPS, ASCII
> `-' [EMAIL PROTECTED]  join the GIMPS @ http://www.mersenne.org
>     Science Explained (by Kids): Clouds just keep circling the earth
>     around and around. And around. There is not much else to do.
------------------------------

Date: Thu, 16 Sep 1999 04:38:00 +0200 (CEST)
From: Henrik Olsen <[EMAIL PROTECTED]>
Subject: Re: Mersenne: complaint

On Wed, 15 Sep 1999, Chris Jefferson wrote:
> > On 14 Sep 99, at 7:22, Henrik Olsen wrote:
> >
> > > Personally I found installing and running it to be an extremely smooth
> > > operation, to the point where it takes less than 2 minutes to install on a
> > > new machine, including configuration, after which I just forget about it.
> >
> > Yes - I found the same thing - applies to both un*x & windoze
> > clients.
>
> Although I don't mean to insult anyone, this is exactly what I mean. I can
> install the program on a computer in 3 or 4 minutes. The problem is those
> people who can't... Also, just one question: why is it I sometimes see
> people write un*x instead of unix, which is I assume what they mean?

Un*x, or the alternative *nix, is used because Unix is a registered
trademark, originally AT&T's I believe, for one very specific OS.

Most people talking about un*x mean a group of OSes characterised by having
(more or less) the same libraries, tools, security model and directory
structure, which means the look and feel is the same, but the innards can
be completely different - and only one of them (modulo licensing agreements
and other weirdness) can be called Unix.

I hope this clears it up a bit.

--
Henrik Olsen, Dawn Solutions I/S    URL=http://www.iaeste.dk/~henrik/
  Thomas Covenant: I am the savior of The Land.
  Linden Avery: Can I help?
  Thomas Covenant: Over my dead body. (dies)
  (Linden Avery saves The Land.)
    The Second Chronicles of Thomas Covenant, Book-A-Minute version

------------------------------

Date: Thu, 16 Sep 1999 01:14:03 EDT
From: [EMAIL PROTECTED]
Subject: Mersenne: Apple? Ew.

<<I go to my mailbox and what do I find?>>

S.

"There are lies, damned lies, and benchmarks" L.

------------------------------

Date: Thu, 16 Sep 1999 13:06:04 +0200
From: Harald Tveit Alvestrand <[EMAIL PROTECTED]>
Subject: Re: Mersenne: complaint

At 04:38 16.09.99 +0200, Henrik Olsen wrote:
>Un*x, or the alternative *nix, is used because Unix is a registered
>trademark, originally AT&T's I believe, for one very specific OS.
>
>Most people talking about un*x mean a group of OSes characterised by having
>(more or less) the same libraries, tools, security model and directory
>structure, which means the look and feel is the same, but the innards can
>be completely different - and only one of them (modulo licensing agreements
>and other weirdness) can be called Unix.

Currently I believe the UNIX trademark is administered by The Open Group,
which will allow anyone to call their system "Unix" if it passes their
(quite expensive) Unix certification test, which tests conformance to their
published specifications. (Yes, the specs are freely downloadable -
surprise!)

One version of Linux has paid the bill and passed the test, so at least one
version of Linux is Unix. Go figure....
Harald

--
Harald Tveit Alvestrand, Maxware, Norway
[EMAIL PROTECTED]

------------------------------

Date: Thu, 16 Sep 1999 11:28:38 -0400
From: "St. Dee" <[EMAIL PROTECTED]>
Subject: Mersenne: Celerons vs. Pentium II/III at large FFT lengths?

Has anyone else noticed that their Celeron-based machines seem to take a
relatively bigger performance hit when moving from testing exponents at the
384K FFT size to the 448K FFT size (under v18.1, at least)?

I have a couple of non-overclocked Celeron 400 machines and, at the 384K
FFT size, they report timings nearly identical to those George sent out in
his email message of a couple of days ago, timings generated by his
PII-400. However, when the machines test exponents at the 448K FFT size,
they are more than 20% slower! Is the Celeron's relatively small L2 cache
finally causing it to lose ground to the PII/PIIIs as the FFT sizes get
larger and larger?

Kel

------------------------------

Date: Thu, 16 Sep 1999 11:46:38 -0400
From: "St. Dee" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Celerons vs. Pentium II/III at large FFT lengths?

At 11:28 AM 9/16/1999 -0400, St. Dee wrote:
>Has anyone else noticed that their Celeron-based machines seem to take a
>relatively bigger performance hit when moving from testing exponents at the
>384K FFT size to the 448K FFT size (under v18.1, at least)?
>
>I have a couple of non-overclocked Celeron 400 machines and, at the 384K
>FFT size, they report timings nearly identical to those George sent out in
>his email message of a couple of days ago, timings generated by his
>PII-400. However, when the machines test exponents at the 448K FFT size,
>they are more than 20% slower! Is the Celeron's relatively small L2 cache
>finally causing it to lose ground to the PII/PIIIs as the FFT sizes get
>larger and larger?

Accccckkkkkk! That was rather inexact wording. Let me try again... my
Celeron 400 based systems crunch exponents in the 384K FFT range at about
the same speed as George's PII-400 machine. However, at the 448K FFT size,
George's machine appears to be 20% or more faster than my Celeron 400s.
Could the 128K L2 cache of the Celeron chips (vs. the 512K L2 cache of the
PIIs) be the culprit?

Thanks,

Kel

------------------------------

Date: Thu, 16 Sep 1999 17:12:29 +0100
From: Nick Craig-Wood <[EMAIL PROTECTED]>
Subject: Re: Mersenne: complaint

On Wed, Sep 15, 1999 at 07:42:53PM +0100, Brian J. Beesley wrote:
> I'm going to be _very_ interested in how many people choose to run
> $100,000 prize candidates using v19. There is an obvious balance
> between the time it takes to complete a test & the enthusiasm of
> people to participate, even if there is a substantial cash prize
> riding on it.

If one had the data, it would be reasonably easy to calculate the
expectation (in $ per year) for a given exponent size & CPU. You'd need to
know the iteration time for any given exponent on that CPU and the
probability distribution of Mersenne primes at that exponent size.
Unfortunately I don't know these, otherwise I would have worked it out ;-)

(Digression: though expectation is a useful statistical measure, in real
life where very small probabilities are multiplied by very large amounts
the results are not useful practically. E.g. the UK lottery - for your
1 pound stake your expectation is 50p.
Better to go to the bookies, where for your pound you get an expectation of
95p...)

--
Nick Craig-Wood
[EMAIL PROTECTED]  http://www.axis.demon.co.uk/

------------------------------

Date: Thu, 16 Sep 1999 19:02:08 +0100
From: "Brian J. Beesley" <[EMAIL PROTECTED]>
Subject: Mersenne: un*x

"un*x", like "windoze", is a joke. It's pronounced "eunuchs" and derives,
so far as I know, from a saying that "real men use Multics". If you're
under 35, you're not likely to recognise the reference to a long-obsolete
(but advanced, for its day) operating system, used on Honeywell mainframes,
which provided the conceptual model for the development of Unix - a "cut
down" operating system originally designed to run on early-70's
minicomputers, which were (at best) about the same power as the _original_
IBM PC.

The point of the joke is lost if "*nix" is used instead.

Regards
Brian Beesley

------------------------------

Date: Thu, 16 Sep 1999 23:44:49 +0200
From: "Steinar H. Gunderson" <[EMAIL PROTECTED]>
Subject: Mersenne: Re: complaint

On Thu, Sep 16, 1999 at 01:06:04PM +0200, Harald Tveit Alvestrand wrote:
>One version of Linux has paid the bill and passed the test, so at least one
>version of Linux is Unix.

If you wanted to be picky, you could always say that a version of
_GNU_/Linux has passed the test... The tests aren't for the kernel only,
are they?

OK, I'm picky, I'm picky...

/* Steinar */

--
Homepage: http://members.xoom.com/sneeze/

------------------------------

Date: Thu, 16 Sep 1999 18:23:34 -0400
From: Pierre Abbat <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Re: complaint

On Thu, 16 Sep 1999, Steinar H. Gunderson wrote:
>On Thu, Sep 16, 1999 at 01:06:04PM +0200, Harald Tveit Alvestrand wrote:
>>One version of Linux has paid the bill and passed the test, so at least one
>>version of Linux is Unix.
>
>If you wanted to be picky, you could always say that a version of _GNU_/Linux
>has passed the test... The tests aren't for the kernel only, are they?

But "GNU" stands for "GNU's Not Unix!", so how can it be Unix if it isn't?

phma

------------------------------

Date: Thu, 16 Sep 1999 18:35:32 -0400 (EDT)
From: Lucas Wiman <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Celerons vs. Pentium II/III at large FFT lengths?

> Accccckkkkkk! That was rather inexact wording. Let me try again... my
> Celeron 400 based systems crunch exponents in the 384K FFT range at about
> the same speed as George's PII-400 machine. However, at the 448K FFT size,
> George's machine appears to be 20% or more faster than my Celeron 400s.
> Could the 128K L2 cache of the Celeron chips (vs. the 512K L2 cache of the
> PIIs) be the culprit?

I think so. Yves Gallot mentioned something similar on the Primes-L list a
while ago.

This brings us to an interesting point. Should the PrimeNet server start
assigning Celerons <384K FFT Mersennes by default, and save the larger ones
for PIIs/PIIIs?

-Lucas

------------------------------

Date: Thu, 16 Sep 1999 22:20:30 -0400 (EDT)
From: Lucas Wiman <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Re: complaint

> >>One version of Linux has paid the bill and passed the test, so at least one
> >>version of Linux is Unix.
> >
> >If you wanted to be picky, you could always say that a version of _GNU_/Linux
> >has passed the test... The tests aren't for the kernel only, are they?
>
> But "GNU" stands for "GNU's Not Unix!", so how can it be Unix if it isn't?

Wow! Could this be used for some kind of proof of the logical inconsistency
of the GNU system? What if M$ has been right all along? Scary, mind-blowing
stuff.

-Lucas

------------------------------

Date: Thu, 16 Sep 1999 21:01:20 -0700
From: Spike Jones <[EMAIL PROTECTED]>
Subject: Re: Mersenne: complaint

Nick Craig-Wood wrote:

> If one had the data, it would be reasonably easy to calculate the
> expectation (in $ per year) for a given exponent size & CPU.

Nick, I worked this out a few months ago when the prizes were announced.
Using a Pentium II/400, the mathematical expectation for those hunting the
10-million-digit prize comes out to about 18 cents per year, as I recall.
Don't quit your day job... {8-]

spike

------------------------------

Date: Fri, 17 Sep 1999 11:36:18 +0100
From: "Brian J. Beesley" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Celerons vs. Pentium II/III at large FFT lengths?
On 16 Sep 99, at 18:35, Lucas Wiman wrote:
>
> This brings us to an interesting point. Should the PrimeNet server start
> assigning Celerons <384K FFT Mersennes by default, and save the larger
> ones for PIIs/PIIIs?

No. Whatever the problem was (I _did_ manage to duplicate it on my Celeron
366 laptop using v18 - a 448K FFT was actually _slower_ than a 512K FFT on
the same system!), v19 does not suffer from it. And the PrimeNet server
doesn't recognise a Celeron unless v19 is running - in v18, all P6
architecture chips are identified as Pentium Pros.

Note: when you upgrade a v18 client to v19, you should check the CPU type
in Options/CPU - I don't think it checks automatically, except on initial
installation.

Regards
Brian Beesley

------------------------------

Date: Fri, 17 Sep 1999 17:53:39 +0300
From: Jukka Santala <[EMAIL PROTECTED]>
Subject: Mersenne: P-1

Well, see the topic. In other words, is it possible to get a quick "P-1
factoring for dummies" rundown? At which point is P-1 factoring worth the
effort? Does it have any overlap with ECM? How should the bounds be chosen
for any given exponent, and will higher bounds always find any factors
smaller bounds would have, or is there an advantage to running lower
bounds? As I understand it, every run with the _same_ bounds would find the
same factors?

-Donwulff

------------------------------

Date: Fri, 17 Sep 1999 10:58:20 -0400 (EDT)
From: "St. Dee" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Celerons vs. Pentium II/III at large FFT lengths?

On Fri, 17 Sep 1999, Brian J. Beesley wrote:

> On 16 Sep 99, at 18:35, Lucas Wiman wrote:
> >
> > This brings us to an interesting point. Should the PrimeNet server start
> > assigning Celerons <384K FFT Mersennes by default, and save the larger
> > ones for PIIs/PIIIs?
>
> No. Whatever the problem was (I _did_ manage to duplicate it on my
> Celeron 366 laptop using v18 - a 448K FFT was actually _slower_ than
> a 512K FFT on the same system!), v19 does not suffer from it. And the
> PrimeNet server doesn't recognise a Celeron unless v19 is running -
> in v18, all P6 architecture chips are identified as Pentium Pros.

So I guess I should have my Celerys test larger exponents, at least until I
upgrade to v19... ;-)

Kel

------------------------------

Date: Fri, 17 Sep 1999 19:57:06 +0300
From: Jukka Santala <[EMAIL PROTECTED]>
Subject: Mersenne: v19 DNS(?) crash...

v19 is giving me trouble now. When I try to start it up, it says
"Contacting PrimeNet Server", hangs there for a while and then crashes with
Application Error "The exception unknown software exception (0x000006ba)
occurred in the application at location 0x77e1fc45". I don't need to go
through the debugger to guess this is because the university network I'm
using seems to be cut off from the USA currently, i.e. DNS queries among
other things fail.

-Donwulff

------------------------------

Date: Fri, 17 Sep 1999 23:38:10 +0200
From: "Steinar H. Gunderson" <[EMAIL PROTECTED]>
Subject: Mersenne: Timing(?) errors

All, I'm running Prime95 v18 on a Dell XPS P60 (no, I don't want to hear
`switch to factoring' or anything -- it's running double-checks).
When activity happens (in this case a Word document being opened and looked
at), sometimes the log shows stuff like:

[lots of 0.814 sec iteration times]
Iteration: 3077000 / 3644xxx. Clocks: [48.8 million] = 0.814 sec.
Iteration: 3078000 / 3644xxx. Clocks: [56.1 million] = 0.936 sec.  <-- Word
Iteration: 3079000 / 3644xxx. Clocks: [46.7 million] = 0.779 sec.
Iteration: 3080000 / 3644xxx. Clocks: [47.0 million] = 0.784 sec.
Iteration: 3081000 / 3644xxx. Clocks: [46.7 million] = 0.779 sec.

Does anybody have an idea why it is actually faster now? My best guess
would be some kind of timing mistake, but it still doesn't sound right...

(The computer was left totally untouched after the Word usage, which shows
itself in the 0.936 timing. The machine should have enough memory -- 40 MB.
Running Win95, no evil CPU-hogging programs.)

/* Steinar */

--
Homepage: http://members.xoom.com/sneeze/

------------------------------

Date: Fri, 17 Sep 1999 17:37:38 EDT
From: [EMAIL PROTECTED]
Subject: Mersenne: G4: real or hype?

Jonathan Zylstra wrote about the new Apple G4 processor (see quotes below).
The G4 is really a hybrid: the basic PowerPC CPU combined with a 128-bit
vector unit called AltiVec. Richard Crandall (who consults for Apple) and
Jason Klivington (of Apple) used a beta version of the G4 to do some of the
all-integer verification of our (Crandall, Mayer, Papadopoulos) F24 (24th
Fermat number) project.
Note that we didn't get a chance to test the floating-point capabilities of
the G4 - both main "wavefront" runs of the Pépin test of F24 were on other
hardware (a 250MHz MIPS R10000 and a 167MHz SPARC Ultra-1); both achieved a
length-1 million FFT-based squaring time of under 1 second, whereas the
all-integer version needed about 5 times as long on a 400 MHz G4. One can't
conclude anything from that, since it's like comparing apples and oranges.

Based on the technical specs and the relative all-integer timings on the G4
vs. the Pentium (with similar amounts of code optimization, the G4 looks to
be somewhat faster than a Pentium, but by less than a factor of 2), it
looks like a good processor, but all the ballyhoo needs to be put into
perspective. Before I continue, a disclaimer: I neither work nor consult
for any computer manufacturer. My taste in processors is simple: the
faster, the better.

<<<"sustained performance of 1 gigaflop." (which makes the G4 a
'supercomputer')

"theoretical performance of 4 gigaflops"

"It is a 128 bit processor, and can perform 4 (sometimes 8) 32-bit floating
pt. calculations per cycle."

"It is 3 times faster than the PIII 600MHz">>>

While all of this may be technically true, it's also very misleading:

- The 1 Gflop figure comes from the fact that the G4 can in theory dispatch
2 floating-point operations (1 mul, 1 add) per cycle, so at 500 MHz that
equals 1 Gflop. I defy you to get that kind of performance out of, say, a
code for a large FFT - with very careful coding, you may get half that. The
above FP capabilities are not qualitatively different from those of most
other current high-end processors (Pentium, SPARC, MIPS, Alpha) and are
slightly less than those of the AMD K7, which can dispatch up to 3 FP ops
per cycle (2 adds, 1 mul). Thus, by the same reasoning, AMD could
legitimately claim 2 Gflops for a 667 MHz K7. Never mind that one will only
see that kind of performance for perfectly balanced, perfectly pipelined
code whose data never leave the FP registers.

- 128-bit: also misleading. The AltiVec vector unit can do some fairly nice
operations in which a 128-bit integer operand is treated as a vector of 4,
8, or 16 operands of 32, 16, and 8 bits, respectively, and can do a nice
variety of 4x32-bit vector FP operations, but in my opinion that is far
from constituting a true 128-bit CPU. These enhancements are qualitatively
similar to the Pentium MMX enhancements - useful for things like
multimedia, but nearly useless when one is doing serious math with 64-bit
operands. I say "nearly" since one can, e.g., build various 64-bit integer
operations out of ones on shorter operands, but it's a pain, say, if one
wants a 64x64 ==> 128-bit integer multiply, which is potentially more
useful for LL testing than parallel 16x16 ==> 32-bit multiplies. The
AltiVec 4x32-bit floats, like the similar MMX instructions, are not very
useful for FFT-based large-integer arithmetic - not enough precision.

- "It is 3 times faster than the PIII 600MHz." Based on what? Give us some
SPECint or SPECfp figures that support this before making such claims. The
only datum provided (below) indicates a speedup of 1.45x, far less than 3x.

<<<On the comparison table between the PIII and G4, they show this:

Test          PIII Clock Cycles   G4 Clock Cycles   G4 Performance
256 Pt. FFT   6.94                4                 1.74x better than PIII
                                  (adjusted for MHz: 1.45x faster than PIII)>>>

In what way is a tiny 256-point FFT a good indicator of overall system
performance? Was this single precision (I suspect so) or double? Was it
specially coded to use the 4x32-bit FP ops supported by the G4? (I suspect
so.) If so, was similar coding attempted using the PIII MMX instructions?
(I suspect not.) What numbers emerge when the figures are adjusted not just
for MHz, but also for price?

I've also seen some of Apple's ads in the San Jose Mercury News - the
phrase "The fastest desktop computer on earth" was used. Apparently this
depends on one's definition of both "fastest" and of "desktop computer"
(perhaps even of "Earth" :). I can buy a desktop Alpha 21264 which probably
blows the G4 out of the water, performance-wise (and there, we HAVE some
SPEC numbers to guide us), on 95% of generic compiled code, say in
Numerical Recipes or the SPEC benchmark suites.

So again, folks at Apple, please back your claims up with performance data
based on real-world code, the kind most programmers really write, that
tests more of the instruction set (not just the special goodies you
included) as well as the entire memory system (not just the registers and
L1 cache). Now, if the G4 demonstrates a SPECfp of over 50 at 500MHz, and a
SPECint of over 30, then the "fastest on earth" claim will be more
believable.

None of which is to say it's not a darn good processor - but let's keep the
descriptive language within the realm of the reasonable, shall we?

Some related URLs (thanks to Jason Klivington of Apple for these):

A technical discussion of the instruction set:
http://www.motorola.com/SPS/PowerPC/teksupport/teklibrary/altivec_pem.pdf

A description of the C implementation of the AltiVec instructions:
http://developer.apple.com/hardware/altivec/pdf/altivec_support.pdf

Motorola's general AltiVec site:
http://www.mot.com/SPS/PowerPC/AltiVec

Have fun,
-Ernst

------------------------------

Date: Fri, 17 Sep 1999 19:59:13 -0400
From: "Chris Nash" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: P-1

Hi there

> Well, see the topic. In other words, is it possible to get a quick "P-1
> factoring for dummies" rundown?

Sure, let's give it a try.
Suppose we do some calculations modulo some number N. In effect we're
actually doing all the calculations modulo all the prime factors of N, but
of course we don't know what those prime factors are. But if we could find
a calculation in which one of the prime factors 'disappears', it would help
us find that factor.

Fermat's little theorem tells us that a^(p-1)=1 mod p, for all primes p not
dividing a. In fact, for each a there is a smallest number x>0 such that
a^x=1 mod p (the order of a modulo p). All we usually know about x is that
it divides p-1.

The idea behind P-1 factoring is this: if we compute

R = a^(some big, very composite number) - 1 mod N

and our 'big composite number' is divisible by x, then R will be divisible
by some unknown factor p of N, and we can find p by examining gcd(R,N).
Usually the big composite number is a product of powers of many small
primes, so it has many factors, and there is a good chance the unknown x
(which may well be p-1 itself) is a factor of it.

> At which point is P-1 factoring worth the effort?

Probably as soon as your trial-factoring attempts have exceeded what the
machine is comfortable with. If you've reached 2^32, try P-1 for a little
while. Trial-factoring will be slower if you carry on trying factors, and
your probability of success with trial-factoring large numbers is extremely
low (p divides a random N with probability 1/p).

Of course P-1 may fail. You may have to go a very long way before the
unknown x divides your big composite number - what if x itself has a large
prime factor? P-1 would not find the factor until your P-1 loop reached
that prime (in other words, your limit has to be bigger than the biggest
prime factor of the unknown x). However, there are factors that P-1 finds
'easily', and even a failed P-1 run tells you a little more about the
number, which might help if you try some more trial-factoring (you know,
for instance, that any factor p must have some prime, or prime power,
factor of p-1 above the limit).
> Does it have any overlap with ECM?

The theory is very similar. Both algorithms attempt to find a point where
one of the unknown prime factors 'disappears' from the calculation.
However, ECM has better odds. In a P-1 attempt, the unknown x is fixed. You
can't do anything about it, and even if you try a different base a, you're
very likely to get a similar x with the same problems (a large prime
factor). In ECM, choosing a different curve gives an equivalent 'x' value
that varies a lot from curve to curve, and one of those 'x' values may
disappear from the calculations quite quickly (though again, with large
values it could be a long time before that happens). Of course the steps
involved in ECM factoring are a little more complex than P-1...

> How should the bounds
> be chosen for any given exponent, and will higher bounds always find any
> factors smaller bounds would have, or is there an advantage to running
> lower bounds? As I understand it, every run with the _same_ bounds would
> find the same factors?

In theory you can change the base and the bounds. Changing the base often
has little or no effect, unless you are incredibly lucky (and 'obvious'
bases related to the number are likely to be of little use - don't use base
a=2 for Mersennes!). Changing the bounds, though, can make the difference
between finding a factor and not finding one. We may fail when we test

a^(some small composite number) - 1

but succeed when we test

a^((some small composite number)*(some larger composite number)) - 1

Written like this, you can also see that the larger bound succeeds whenever
the smaller bound does: the first quantity always divides the second, so
any factors of the first also appear in the second. Of course larger bounds
take longer to compute, and there is also the possibility that larger
bounds would find "more than one" factor in a single run. Ideally you check
periodically through the calculation to see if you have already met a
factor, but that takes time.
The overriding "decision factor" is based purely on the time you're willing
to spend. Factoring being inherently harder than primality testing, you
might be happy with, say, spending 10 times as long searching for a factor
as you would on a proof that the number is composite. So you might find
"some very big composite number" of 10 times the bit length of N
acceptable; 10 times the bit length of N is a good ballpark estimate for
the bounds setting that keeps the P-1 test within that sort of time.

Of course, if you were willing to spend 100 or 1000 times as long, you
could set the bounds higher... but in that case, bear in mind that the P-1
test is often unsuccessful. If you have that much time to spend, you might
prefer to dedicate it to a more sophisticated algorithm. Just as with
trial-factoring, you have to increase the bounds *A LOT* before your
chances of success improve significantly - ultimately the two are related
very closely, because success depends on the factors of the unknown magic
number x.

Hope this helps

Chris Nash
Lexington KY UNITED STATES

------------------------------

End of Mersenne Digest V1 #627
******************************