Mersenne: Re: New beta mers release: new Lucas-Lehmer program
On Tue, Oct 05, 1999 at 06:19:20PM +0200, Guillermo Ballester Valor wrote:

>ii) On the other hand, FFTW does not use 'register' at all. All local
>variables are stored on the stack. I don't know much about compilers, but
>perhaps some good compilers can use register storage as a speed
>optimization.

gcc should be able to, at least newer versions (post-egcs phase).

>Looking at the code generated by gnu-gcc on Intel
>processors, some local double variables are kept on the Intel FPU and the
>performance is very good.

Try Pentium GCC once (http://www.goof.com/pcg/). It has some MMX support
built in. If you run out of integer registers (and don't do float), it
interestingly enough stores data in MMX registers.

On the other hand, I think MacLucasUNIX will use a _lot_ of floats. It will
even compile on non-Intel machines (I think...), but I've got _no_ clue at
all (I think nobody has...) about the performance. Volunteers?

>My question is: what happens in the FFTW code if we directly add the
>'register' keyword to its local temporary variable definitions?
>Can this sort of thing be done with a single compiler option?

-Dregister= will do the trick and remove _all_ register keywords. They're
a bad thing in general. The compiler should be able to decide for itself
which variables are to be put in registers. const, on the other hand,
should be used as much as possible (think
`const struct foo * const * const bar' here...). But I guess they wouldn't
have overlooked such a textbook rule...

>> The other line of approach I have on improving MacLucasUNIX is to try
>> Digital's native C compiler - the linux beta is currently available
>> FOC, but unfortunately I will have to upgrade linux to run it as it
>> requires 5.2 or later.

Linux 5.2? Surely you must be referring to Red Hat 5.2? Upgrading your libc
from scratch isn't all that hard, if you can live with a bit of tweaking.

/* Steinar */
--
Homepage: http://members.xoom.com/sneeze/
________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
Mersenne Digest V1 #637
Mersenne Digest        Tuesday, October 5 1999        Volume 01 : Number 637

----------------------------------------------------------------------

Date: Sun, 3 Oct 1999 14:33:51 -0400
From: "Matthew Smith" <[EMAIL PROTECTED]>
Subject: Mersenne: Mprime

I really want to run the Linux prime search software when I'm in Red Hat
5.2 (I'm usually in Windows 98), but the program seems to have no
documentation or information once it's running. I ran it for 4 hours
once and had no idea how to close it. I had to use "kill," the only
command I could think of to close it. When I got back to Windows, I saw
that my work had not been saved. I had set the option in Windows for
disk saves every 15 minutes. Mprime and Prime95 shared the same data
files, like they are supposed to. What's going on? I'd vote for an
mprime on an X-server. I want to see SOMETHING while the thing's going.
At least show me the iteration #, iteration % of total, and seconds /
iteration.

"The brave do not fear the grave." --Battle Square, Final Fantasy VII

------------------------------

Date: Sun, 3 Oct 1999 15:39:53 -0400 (EDT)
From: Lucas Wiman <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Mprime

> I really want to run the Linux prime search software when I'm in Red Hat
> 5.2 (I'm usually in Windows 98), but the program seems to have no
> documentation or information once it's running. I ran it for 4 hours
> once and had no idea how to close it. I had to use "kill," the only
> command I could think of to close it. When I got back to Windows, I saw
> that my work had not been saved. I had set the option in Windows for
> disk saves every 15 minutes. Mprime and Prime95 shared the same data
> files, like they are supposed to. What's going on? I'd vote for an
> mprime on an X-server. I want to see SOMETHING while the thing's going.
> At least show me the iteration #, iteration % of total, and seconds /
> iteration.

(could everyone please set their mailer to wrap at <80 characters)

Try mprime -m
then choose 6. Test/Continue

It seems a bit odd that it didn't save work when it was killed. I thought
it was supposed to. What signal did you send the process? Try CTRL+C if
kill continually keeps it from saving.

-Lucas

------------------------------

Date: Sun, 03 Oct 1999 21:35:49 GMT
From: [EMAIL PROTECTED] (Michael Oates)
Subject: Re: Mersenne: Mprime

On Sun, 3 Oct 1999 15:39:53 -0400 (EDT), you wrote:

>> I really want to run the Linux prime search software when I'm in Red Hat
>> 5.2 (I'm usually in Windows 98), but the program seems to have no
>> documentation or information once it's running. I ran it for 4 hours
>> once and had no idea how to close it. I had to use "kill," the only
>> command I could think of to close it. When I got back to Windows, I saw
>> that my work had not been saved. I had set the option in Windows for
>> disk saves every 15 minutes. Mprime and Prime95 shared the same data
>> files, like they are supposed to. What's going on? I'd vote for an
>> mprime on an X-server. I want to see SOMETHING while the thing's
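
The question above about which signal was sent matters for a general reason: a plain `kill` sends SIGTERM and Ctrl+C sends SIGINT, both of which a program can catch and respond to, whereas `kill -9` (SIGKILL) cannot be caught at all. The following is only a minimal sketch in Python of a long-running loop that writes a save file when a catchable signal arrives. It does not describe how mprime is implemented; the names `Checkpointer`, `install_handlers` and `run`, and the JSON save format, are hypothetical and chosen for illustration.

```python
import json
import os
import signal
import tempfile


class Checkpointer:
    """Hypothetical helper: records a stop request and writes save files."""

    def __init__(self, path):
        self.path = path
        self.stop_requested = False

    def request_stop(self, signum, frame):
        # Signal handlers should do as little as possible: just set a flag
        # and let the main loop save and exit at a safe point.
        self.stop_requested = True

    def save(self, state):
        # Write to a temporary file first, then rename it over the old save
        # file, so an interruption never leaves a half-written checkpoint.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle)
        os.replace(tmp_path, self.path)


def install_handlers(checkpointer):
    # SIGINT is what Ctrl+C sends; SIGTERM is what a plain `kill` sends.
    # SIGKILL (`kill -9`) cannot be caught, so no handler is possible for it.
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, checkpointer.request_stop)


def run(checkpointer, total_iterations, save_every=1000):
    """Count up to total_iterations, saving periodically and on a stop request."""
    state = {"iteration": 0}
    while state["iteration"] < total_iterations and not checkpointer.stop_requested:
        state["iteration"] += 1  # stand-in for one unit of real work
        if state["iteration"] % save_every == 0:
            checkpointer.save(state)
    checkpointer.save(state)  # final save, whether finished or interrupted
    return state
```

A program using this sketch would call `install_handlers(cp)` once at startup and then `run(cp, n)`. With that in place, both Ctrl+C and a plain `kill` lead to a final save, while `kill -9` would still lose everything since the last periodic save.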
Re: Mersenne: New beta mers release: new Lucas-Lehmer program
On 5 Oct 99, at 18:19, Guillermo Ballester Valor wrote:

> Well, I wonder what the reason is for such different performance. On Intel
> machines MacLucasFFTW runs more than twice as fast as MacLucas, and on
> RISC processors MacLucasUNIX is better than MacLucasFFTW. Looking at the
> code, without deep understanding, one can see:
>
> i) MacLucasUNIX makes intensive use of the 'register' keyword in local
> definitions, so a processor with many internal registers can allocate
> most of them. This is a good thing because they can be accessed very fast.
> The bad thing is that on processors with very few registers (like
> Intel's) it can slow things down.

And (according to a local computer scientist, who I think knows what he's
talking about) with modern processors making extensive use of register
renaming, it's not usually sensible to use the "register" keyword _at all_.
The theory is that the instruction scheduler can do at least as good a job
as the programmer - it gets more choice, anyway, e.g. there are 40 32-bit
general-purpose registers in the Intel PPro, but only a few have "names" at
any given time.

> ii) On the other hand, FFTW does not use 'register' at all. All local
> variables are stored on the stack. I don't know much about compilers, but
> perhaps some good compilers can use register storage as a speed
> optimization. Looking at the code generated by gnu-gcc on Intel
> processors, some local double variables are kept on the Intel FPU and the
> performance is very good.

Storing data on a stack is not very efficient in most RISC architectures -
you tend to cause problems with cache alignment, overloading cache lines
causing high miss rates, etc. The small caches on the Alpha 21164 design
possibly contribute to this - the L1 data cache is only 8K bytes & the L2
cache is only 96K bytes (but there can be an L3 cache which is at least 2M
bytes, if fitted).

The Intel FPU is a special case!

> My question is: what happens in the FFTW code if we directly add the
> 'register' keyword to its local temporary variable definitions?
> Can this sort of thing be done with a single compiler option?
>
> I did it. I've added register declarations to all FFTW radix routines
> up to radix-16 (which need no more than 32 stack variables). For Intel
> machines the code is untouched (because I defined REG as an empty
> comment there). But I don't own a RISC machine, so I have no
> idea about its performance. Any volunteers?

Sure, I'll give it a go. Just mail me the source ... I've access to a Sun
Ultra 10 as well (running Solaris, but with the gcc compiler, not Sun's
own).

> Any improvement on MacLucas is desired.

Any improvement on _anything_ is desirable !!!

Actually MacLucasUNIX on my Alpha system isn't bad: compiled from pure C
source with gcc, it gives Prime95/mprime running on a PII-333 a good run
for its money (a bit faster, or a bit slower, depending on the exponent).
Given the brilliant optimization George has done for the Intel CPU, I think
this is quite good.

I'm pretty sure I could at least double the speed of MacLucasUNIX on the
Alpha by replacing critical chunks of code with hand-tuned assembler, but
the investment in terms of time & effort is too much for me 8-(

> I think we can squeeze FFTW a lot more. I like its code very much. The
> good performance on Intel (45% with respect to mprime) is good enough to
> justify a little more work on it.

I agree - in particular, there's an obvious gain in being able to do FFT
with run lengths other than powers of 2, once you have the speed in the
same ballpark.

Nevertheless, I think FFTW will be hard pushed to match mprime on 32-bit
Intel architecture systems. There is an obvious need for something
reasonably efficient and portable, if only to be able to take advantage of
new processor designs (like Merced, and to a lesser extent Athlon) without
having to expend very large amounts of effort in hand-optimization.

Regards
Brian Beesley
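
The gain from run lengths other than powers of 2 can be seen with a small calculation. Elsewhere in this thread, exponent 11213 is reported to use a run length of 1024 in MacLucasUNIX and 640 (= 2^7 * 5) in MacLucasFFTW. The sketch below, in Python, compares the next power of two with the next length whose only prime factors are 2, 3 and 5. It is illustrative only: the function names are hypothetical, restricting to the factors 2, 3 and 5 is an assumption rather than a statement about what either program supports, and the minimum length actually needed for a given exponent (which depends on how many bits each FFT element carries) is taken as an input rather than computed.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n."""
    length = 1
    while length < n:
        length *= 2
    return length


def is_smooth(n: int, primes=(2, 3, 5)) -> bool:
    """True if n has no prime factors other than those listed in `primes`."""
    for q in primes:
        while n % q == 0:
            n //= q
    return n == 1


def next_smooth_length(n: int, primes=(2, 3, 5)) -> int:
    """Smallest length >= n whose prime factors all come from `primes`."""
    length = max(n, 1)
    while not is_smooth(length, primes):
        length += 1
    return length


# Suppose (hypothetically) a test needs a run length of at least 601.
minimum = 601
print(next_power_of_two(minimum))   # 1024
print(next_smooth_length(minimum))  # 625
```

In this hypothetical case the power-of-two code has to process 1024 elements where a mixed-radix code could process 625, which is why a slower per-element transform can still come out ahead once its speed is in the same ballpark.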
Mersenne: forgot my password
Okay, the one machine I had set up the GIMPS client on is currently...
well, the hard drive is in the mail :(

I'm setting it up on another computer, and I have no idea what password was
assigned to me. I'm used to being able to click an "I forgot my password,
please email it to me" button.

__
PGP fingerprint = 03 5B 9B A0 16 33 91 2F A5 77 BC EE 43 71 98 D4
[EMAIL PROTECTED] / http://www.op.net/~darxus
Join the Great Internet Mersenne Prime Search
http://www.mersenne.org/prime.htm
Mersenne: From elsewhere, on network protocols being readable
Harrumph.

___
David Nicol 816.235.1187 [EMAIL PROTECTED]
./configure && make && make test

Grant Edwards <[EMAIL PROTECTED]>:
> I'm sure most people on this list already realize it, but whoever
> first decided to implement things like this using ASCII line-oriented
> protocols on telnettable TCP/IP ports deserves free doughnuts for the
> rest of his/her life.
>
> Being able to implement/test servers and clients with expect, telnet,
> etc. has undoubtedly saved thousands of hours of development time.

You speak truly indeed. I have a good bit to say about this in my next
book, "The Art Of Unix Programming".
--
Eric S. Raymond (http://www.tuxedo.org/~esr)

The IRS has become morally corrupted by the enormous power which we in
Congress have unwisely entrusted to it. Too often it acts like a Gestapo
preying upon defenseless citizens.
        -- Senator Edward V. Long
Re: Mersenne: New beta mers release: new Lucas-Lehmer program
Hi:

"Brian J. Beesley" wrote:

> Well, without the tunefftw program, I did manage to make MacLucasFFTW
> on my Alpha system & it does pass the self-tests. However, the
> performance is not particularly brilliant. For a complete test of
> exponent 11213 using MacLucasUNIX (FFT run length 1024) the CPU time
> is 3.45 seconds, but MacLucasFFTW takes 4.53 seconds CPU using a FFT
> run length of 640. (This is an Alpha 21164PC at 533 MHz with 2 MB L3
> cache and 256 MB SDRAM, running Red Hat Linux 5.1)

Well, I wonder what the reason is for such different performance. On Intel
machines MacLucasFFTW runs more than twice as fast as MacLucas, and on
RISC processors MacLucasUNIX is better than MacLucasFFTW. Looking at the
code, without deep understanding, one can see:

i) MacLucasUNIX makes intensive use of the 'register' keyword in local
definitions, so a processor with many internal registers can allocate
most of them. This is a good thing because they can be accessed very fast.
The bad thing is that on processors with very few registers (like
Intel's) it can slow things down.

ii) On the other hand, FFTW does not use 'register' at all. All local
variables are stored on the stack. I don't know much about compilers, but
perhaps some good compilers can use register storage as a speed
optimization. Looking at the code generated by gnu-gcc on Intel
processors, some local double variables are kept on the Intel FPU and the
performance is very good.

My question is: what happens in the FFTW code if we directly add the
'register' keyword to its local temporary variable definitions?
Can this sort of thing be done with a single compiler option?

I did it. I've added register declarations to all FFTW radix routines
up to radix-16 (which need no more than 32 stack variables). For Intel
machines the code is untouched (because I defined REG as an empty
comment there). But I don't own a RISC machine, so I have no
idea about its performance. Any volunteers?

> The other line of approach I have on improving MacLucasUNIX is to try
> Digital's native C compiler - the linux beta is currently available
> FOC, but unfortunately I will have to upgrade linux to run it as it
> requires 5.2 or later. (I think the version of libc is the critical
> factor.) The principle being that, when it comes to squeezing
> performance out of an Alpha CPU, the people who developed the Alpha
> architecture may well do a better job than the people who develop
> gcc.

Any improvement on MacLucas is desired.

I think we can squeeze FFTW a lot more. I like its code very much. The
good performance on Intel (45% with respect to mprime) is good enough to
justify a little more work on it.

Regards

| Guillermo Ballester Valor       |
| [EMAIL PROTECTED]               |
| c/ cordoba, 19                  |
| 18151-Ogijares (Spain)          |
| (Linux registered user 1171811) |
Mersenne: Online proof of Lucas-Lehmer
The Lucas-Lehmer test seems pretty magical to me and I have wanted to see a
full proof of the theorem for some time. For a long time, there has been a
proof online of one direction: if M(p) divides the relevant term of the LL
sequence, then M(p) is prime. I had still never seen the other half (if
M(p) is prime, then it divides that term). So I worked out the rest of a
proof.

If interested, please take a look at
http://www.jt-actuary.com/lucas-le.htm. I see one "typo" where the "less
than" symbol (<) displays incorrectly online. There may be other typos and
even errors. (Well, I can hope not...)

If this seems readable, would anyone want it linked in to a FAQ? (Not my
call. I just wanted a proof.)

Trivial note: back in the summer of 1968, I was one of a bunch of high
school kids who met D. H. Lehmer (the Lehmer of LL fame; his father, D. N.
Lehmer, was also a UC Berkeley math professor) in the basement of one of
the engineering buildings at Berkeley. (It may have been a math building
then, but the location is now an engineering building and the math
buildings were in three other spots even then.) He showed us his "DLS-127"
(Delay Line Sieve). This was one of the best prime-crunchers of the 1960s.
As I recall, it was an ANALOG computer based on very precise inductors.

Thanks,
Joth
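
For readers who have not seen the test written out, here is a minimal sketch in Python of the statement being proved: for an odd prime p, M(p) = 2^p - 1 is prime exactly when M(p) divides the term s(p-2) of the sequence s(0) = 4, s(k+1) = s(k)^2 - 2. The function name `lucas_lehmer` is chosen for illustration. The sketch relies on Python's built-in big integers and is not how the FFT-based programs discussed on this list carry out the squaring.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if the Mersenne number M(p) = 2**p - 1 is prime.

    The test is valid for odd prime p; p == 2 is handled as a special case.
    """
    if p == 2:
        return True  # M(2) = 3 is prime
    m = (1 << p) - 1  # the Mersenne number 2**p - 1
    s = 4             # s(0)
    for _ in range(p - 2):
        # s(k+1) = s(k)**2 - 2, reduced modulo M(p) at every step
        s = (s * s - 2) % m
    return s == 0     # M(p) is prime exactly when it divides s(p-2)


# Prime exponents up to 31 whose Mersenne numbers pass the test.
mersenne_exponents = [p for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31)
                      if lucas_lehmer(p)]
print(mersenne_exponents)  # [3, 5, 7, 13, 17, 19, 31]
```

The exponents 11, 23 and 29 drop out because M(11) = 2047 = 23 * 89 and M(23), M(29) are likewise composite. Reducing modulo M(p) at every step keeps the intermediate values at about p bits, which is what makes the test practical for large exponents.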