Mersenne: Re: New beta mers release: new Lucas-Lehmer program

1999-10-05 Thread Steinar H. Gunderson

On Tue, Oct 05, 1999 at 06:19:20PM +0200, Guillermo Ballester Valor wrote:
>ii) On the other hand, FFTW does not use 'register' at all. All local
>variables are stored on the stack. I don't know much about compilers, but
>perhaps some good compilers can use register storage as a speed
>optimization.

gcc should be able to, at least newer versions (post-egcs phase). 

>Looking at the code generated by GNU gcc on Intel
>processors, some local double variables are stored in the Intel FPU and
>the performance is quite good.

Try Pentium GCC sometime (http://www.goof.com/pcg/). It has some MMX support
built in. If you run out of integer registers (and don't use floats),
interestingly enough it stores data in MMX registers. On the other hand,
I think MacLucasUNIX will use a _lot_ of floats.

It will even compile on non-Intel machines (I think...), but I've got
_no_ clue at all (I think nobody has...) about the performance. Volunteers?

>My question is: what happens in the FFTW code if we add the 'register'
>keyword directly to its local temporary variable definitions? Can this
>sort of thing be done with a single compiler option?

-Dregister= will do the trick and remove _all_ register keywords. They're
a bad thing in general. The compiler should be able to decide for itself
which variables to put in registers.
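
As a minimal sketch of what -Dregister= does: the function below is a
hypothetical inner loop (it is not taken from MacLucasUNIX or FFTW) that
carries 'register' hints. Built normally, the hints stay; built with
-Dregister=, the preprocessor defines 'register' as an empty macro and
deletes every occurrence before the compiler sees the code. Note that the C
standard does not formally allow redefining a keyword when standard headers
are included, although in practice gcc accepts it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not code from MacLucasUNIX or FFTW.
 *
 *   gcc -O2 -c sum.c              keeps the 'register' hints
 *   gcc -O2 -Dregister= -c sum.c  strips them; the compiler then picks
 *                                 which variables live in registers
 */
double sum_of_squares(const double *x, size_t n)
{
    register double acc = 0.0;  /* plain "double acc" under -Dregister= */
    register size_t i;          /* plain "size_t i"   under -Dregister= */

    assert(x != NULL || n == 0);
    for (i = 0; i < n; i++)
        acc += x[i] * x[i];
    return acc;
}
```

Either build computes the same result; only the allocation hints differ,
so timing both object files is a cheap way to see whether the keyword
matters on a given compiler and CPU.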

const, on the other hand, should be used as much as possible (think `const
struct foo * const * const bar' here...). But I guess they wouldn't have
overlooked such a textbook rule...
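
To make the declaration above concrete, here is a small sketch. Only the
shape `const struct foo * const * const bar' comes from the text; the
struct layout and the function name are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct foo { int value; };   /* hypothetical layout */

/* Read right to left: bar is a const pointer to a const pointer to a
 * const struct foo. Nothing reachable through bar can be modified here,
 * and the compiler is told so explicitly. */
int first_value(const struct foo * const * const bar)
{
    assert(bar != NULL && *bar != NULL);
    /* (*bar)->value = 1;   rejected: the struct is const        */
    /* *bar = NULL;         rejected: the inner pointer is const */
    /* bar = NULL;          rejected: bar itself is const        */
    return (*bar)->value;
}
```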

>> The other line of approach I have on improving MacLucasUNIX is to try
>> Digital's native C compiler - the linux beta is currently available
>> FOC, but unfortunately I will have to upgrade linux to run it as it
>> requires 5.2 or later.

Linux 5.2?

Surely, you must be referring to RedHat 5.2? Upgrading your libc from
scratch isn't all that hard, if you can do with a bit of tweaking.

/* Steinar */
-- 
Homepage: http://members.xoom.com/sneeze/
_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Mersenne Digest V1 #637

1999-10-05 Thread Mersenne Digest


Mersenne Digest        Tuesday, October 5 1999        Volume 01 : Number 637




--

Date: Sun, 3 Oct 1999 14:33:51 -0400
From: "Matthew Smith" <[EMAIL PROTECTED]>
Subject: Mersenne: Mprime

I really want to run the Linux prime search software when I'm in Red Hat
5.2 (I'm usually in Windows 98), but the program seems to have no
documentation or information once it's running.  I ran it for 4 hours
once and had no idea how to close it.  I had to use "kill," the only
command I could think of to close it.  When I got back to Windows, I saw
that my work had not been saved.  I had set the option in Windows for
disk saves every 15 minutes.  Mprime and Prime95 shared the same data
files, like they are supposed to.  What's going on?  I'd vote for an
mprime on an X-server.  I want to see SOMETHING while the thing's going.
At least show me the iteration #, iteration % of total, and seconds /
iteration.

- -

"The brave do not fear the grave."

--Battle Square, Final Fantasy VII

- 


--

Date: Sun, 3 Oct 1999 15:39:53 -0400 (EDT)
From: Lucas Wiman  <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Mprime

> I really want to run the Linux prime search software when I'm in Red Hat 5.2 (I'm 
>usually in Windows 98), but the program seems to have no documentation or information 
>once it's running.  I ran it for 4 hours once and had no idea how to close it.  I had 
>to use "kill," the only command I could think of to close it.  When I got back to 
>Windows, I saw that my work had not been saved.  I had set the option in Windows for 
>disk saves every 15 minutes.  Mprime and Prime95 shared the same data files, like 
>they are supposed to.  What's going on?  I'd vote for an mprime on an X-server.  I 
>want to see SOMETHING while the thing's going.  At least show me the iteration #, 
>iteration % of total, and seconds / iteration.

(could everyone please set their mailer to wrap at <80 characters)

Try mprime -m 
then choose 6.  Test/Continue

It seems a bit odd that it didn't save work when it was killed.
I thought it was supposed to.  What signal did you send the process?
Try CTRL+C if kill continually keeps it from saving.
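
Which signal was sent matters: a plain "kill" sends SIGTERM and CTRL+C
sends SIGINT, both of which a program can catch, while "kill -9" sends
SIGKILL, which cannot be caught, so nothing can be saved on the way out.
The sketch below shows the usual pattern for catching the first two. It is
only an illustration of the technique, not mprime's actual code, and every
name in it is hypothetical.

```c
#include <assert.h>
#include <signal.h>

/* Set by the handler; the work loop polls it between iterations. */
static volatile sig_atomic_t stop_requested = 0;

static void request_stop(int signum)
{
    (void)signum;
    stop_requested = 1;   /* only set a flag; do the saving elsewhere */
}

/* Catch CTRL+C (SIGINT) and a plain "kill" (SIGTERM).
 * Returns 0 on success, -1 if a handler could not be installed. */
int install_stop_handlers(void)
{
    if (signal(SIGINT, request_stop) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, request_stop) == SIG_ERR)
        return -1;
    return 0;
}

/* Hypothetical work loop: runs until finished or asked to stop and
 * returns the number of iterations completed. The caller would write
 * its save file at that point. */
long run_iterations(long total)
{
    long done = 0;

    assert(total >= 0);
    while (done < total && !stop_requested)
        done++;           /* stand-in for one Lucas-Lehmer iteration */
    return done;
}
```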

-Lucas

--

Date: Sun, 03 Oct 1999 21:35:49 GMT
From: [EMAIL PROTECTED] (Michael Oates)
Subject: Re: Mersenne: Mprime

On Sun, 3 Oct 1999 15:39:53 -0400 (EDT), you wrote:

>> I really want to run the Linux prime search software when I'm in Red Hat 5.2 (I'm 
>usually in Windows 98), but the program seems to have no documentation or information 
>once it's running.  I ran it for 4 hours once and had no idea how to close it.  I had 
>to use "kill," the only command I could think of to close it.  When I got back to 
>Windows, I saw that my work had not been saved.  I had set the option in Windows for 
>disk saves every 15 minutes.  Mprime and Prime95 shared the same data files, like 
>they are supposed to.  What's going on?  I'd vote for an mprime on an X-server.  I 
>want to see SOMETHING while the thing's 

Re: Mersenne: New beta mers release: new Lucas-Lehmer program

1999-10-05 Thread Brian J. Beesley

On 5 Oct 99, at 18:19, Guillermo Ballester Valor wrote:

> Well, I wonder what causes such different performance. On Intel
> machines MacLucasFFTW runs more than twice as fast as MacLucas, while on
> RISC processors MacLucasUNIX is better than MacLucasFFTW. Looking at the
> code, without deep understanding, one can see:
> 
> i) MacLucasUNIX makes intensive use of the 'register' keyword in local
> definitions, so a processor with many internal registers can allocate
> most of them. That is a good thing because registers can be accessed very
> fast. The bad thing is that on processors with very few registers (like
> Intel's) it can slow the code down.

And (according to a local computer scientist, who I think knows what 
he's talking about) with modern processors making extensive use of 
register renaming, it's not usually sensible to use the "register" 
keyword _at all_. The theory is that the instruction scheduler can do 
at least as good a job as the programmer - it gets more choice, 
anyway, e.g. there are 40 32-bit general-purpose registers in the 
Intel PPro, but only a few have "names" at any given time.
> 
> ii) On the other hand, FFTW does not use 'register' at all. All local
> variables are stored on the stack. I don't know much about compilers, but
> perhaps some good compilers can use register storage as a speed
> optimization. Looking at the code generated by GNU gcc on Intel
> processors, some local double variables are stored in the Intel FPU and
> the performance is quite good.

Storing data on a stack is not very efficient in most RISC 
architectures - you tend to cause problems with cache alignment, 
overloading cache lines causing high miss rates, etc. The small 
caches on the Alpha 21164 design possibly contribute to this - the L1 
data cache is only 8K bytes & the L2 cache is only 96K bytes (but 
there can be a L3 cache which is at least 2M bytes, if fitted).

The Intel FPU is a special case! 
> 
> My question is: what happens in the FFTW code if we add the 'register'
> keyword directly to its local temporary variable definitions? Can this
> sort of thing be done with a single compiler option?
> 
> I did it. I've added register declarations to all FFTW radix routines
> up to radix-16 (which need no more than 32 stack variables). For Intel
> machines the code is untouched (because I previously defined REG as a
> void comment). But I don't own a RISC machine, so I have no
> idea about its performance. Any volunteers?
> 
Sure, I'll give it a go. Just mail me the source ... I've access to a 
Sun Ultra 10 as well (running Solaris, but with the gcc compiler, not 
Sun's own).
>
> Any improvement to MacLucas is welcome.

Any improvement on _anything_ is desirable !!! Actually MacLucasUNIX 
on my Alpha system isn't bad, compiled from pure C source with gcc it 
gives Prime95/mprime running on a PII-333 a good run for its money (a 
bit faster, or a bit slower, depending on the exponent). Given the 
brilliant optimization George has done for the Intel CPU, I think 
this is quite good. I'm pretty sure I could at least double the speed 
of MacLucasUNIX on the Alpha, by replacing critical chunks of code 
with hand-tuned assembler, but the investment in terms of time & 
effort is too much for me 8-(
> 
> I think we can squeeze a lot more out of FFTW. I like its code very much.
> Its performance on Intel (45% relative to mprime) is good enough to
> justify a little more work on it.

I agree - in particular, there's an obvious gain in being able to do 
FFT with run lengths other than powers of 2, once you have the speed 
in the same ballpark. Nevertheless, I think FFTW will be hard pushed 
to match mprime on 32-bit Intel architecture systems.

There is an obvious need for something reasonably efficient and 
portable, if only to be able to take advantage of new processor 
designs (like Merced, and to a lesser extent Athlon) without having 
to expend very large amounts of effort in hand-optimization.

Regards
Brian Beesley



Mersenne: forgot my password

1999-10-05 Thread Darxus


Okay, the one machine I had set up the gimps client on is currently..
well, the hard drive is in the mail :(

I'm setting it up on another computer, & I have no idea what password was
assigned to me.  I'm used to being able to click an "I forgot my password,
please email it to me" button.

__
PGP fingerprint = 03 5B 9B A0 16 33 91 2F  A5 77 BC EE 43 71 98 D4
[EMAIL PROTECTED] / http://www.op.net/~darxus
  Join the Great Internet Mersenne Prime Search
http://www.mersenne.org/prime.htm





Mersenne: From elsewhere, on network protocols being readable

1999-10-05 Thread David L. Nicol


Harrumph. 
___
   David Nicol 816.235.1187 [EMAIL PROTECTED]
   ./configure && make && make test


Grant Edwards <[EMAIL PROTECTED]>:
> I'm sure most people on this list already realize it, but whoever
> first decided to implement things like this using ASCII line-oriented
> protocols on telnettable TCP/IP ports deserves free doughnuts for the
> rest of his/her life.
> 
> Being able to implement/test servers and clients with expect, telnet,
> etc. has undoubtedly saved thousands of hours of development time.

You speak truly indeed.  I have a good bit to say about this in my next
book, "The Art Of Unix Programming".
-- 
Eric S. Raymond (http://www.tuxedo.org/~esr)

The IRS has become morally corrupted by the enormous power which we in
Congress have unwisely entrusted to it. Too often it acts like a
Gestapo preying upon defenseless citizens.
-- Senator Edward V. Long






Re: Mersenne: New beta mers release: new Lucas-Lehmer program

1999-10-05 Thread Guillermo Ballester Valor

Hi:

"Brian J. Beesley" wrote:
> Well, without the tunefftw program, I did manage to make MacLucasFFTW
> on my Alpha system & it does pass the self-tests. However, the
> performance is not particularly brilliant. For a complete test of
> exponent 11213 using MacLucasUNIX (FFT run length 1024) the CPU time
> is 3.45 seconds, but MacLucasFFTW takes 4.53 seconds CPU using a FFT
> run length of 640. (This is an Alpha 21164PC at 533 MHz with 2 MB L3
> cache and 256 MB SDRAM, running Red Hat Linux 5.1)
>
 
Well, I wonder what causes such different performance. On Intel
machines MacLucasFFTW runs more than twice as fast as MacLucas, while on
RISC processors MacLucasUNIX is better than MacLucasFFTW. Looking at the
code, without deep understanding, one can see:

i) MacLucasUNIX makes intensive use of the 'register' keyword in local
definitions, so a processor with many internal registers can allocate
most of them. That is a good thing because registers can be accessed very
fast. The bad thing is that on processors with very few registers (like
Intel's) it can slow the code down.

ii) On the other hand, FFTW does not use 'register' at all. All local
variables are stored on the stack. I don't know much about compilers, but
perhaps some good compilers can use register storage as a speed
optimization. Looking at the code generated by GNU gcc on Intel
processors, some local double variables are stored in the Intel FPU and
the performance is quite good.

My question is: what happens in the FFTW code if we add the 'register'
keyword directly to its local temporary variable definitions? Can this
sort of thing be done with a single compiler option?

I did it. I've added register declarations to all FFTW radix routines
up to radix-16 (which need no more than 32 stack variables). For Intel
machines the code is untouched (because I previously defined REG as a
void comment). But I don't own a RISC machine, so I have no
idea about its performance. Any volunteers?
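
A minimal sketch of how a REG switch of this kind could look. This is an
assumption about the approach, not the actual patch: the macro
USE_REGISTER_HINTS and the butterfly routine are hypothetical, and the
routine is far simpler than FFTW's generated radix code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: REG expands to 'register' only when the (hypothetical)
 * switch USE_REGISTER_HINTS is defined, e.g. with
 *     gcc -O2 -DUSE_REGISTER_HINTS -c butterfly.c
 * Otherwise it expands to nothing and the code is unchanged. */
#ifdef USE_REGISTER_HINTS
#define REG register
#else
#define REG /* no hint */
#endif

/* Hypothetical radix-2 butterfly on one pair of complex values. */
void butterfly2(double *re, double *im)
{
    REG double r0, r1, i0, i1;

    assert(re != NULL && im != NULL);
    r0 = re[0]; r1 = re[1];
    i0 = im[0]; i1 = im[1];
    re[0] = r0 + r1;  im[0] = i0 + i1;
    re[1] = r0 - r1;  im[1] = i0 - i1;
}
```

With a switch like this, one source tree can be timed both ways on each
machine without editing the radix routines again.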


> The other line of approach I have on improving MacLucasUNIX is to try
> Digital's native C compiler - the linux beta is currently available
> FOC, but unfortunately I will have to upgrade linux to run it as it
> requires 5.2 or later. (I think the version of libc is the critical
> factor.) The principle being that, when it comes to squeezing
> performance out of an Alpha CPU, the people who developed the Alpha
> architecture may well do a better job than the people who develop
> gcc.
> 
Any improvement to MacLucas is welcome.

I think we can squeeze a lot more out of FFTW. I like its code very much.
Its performance on Intel (45% relative to mprime) is good enough to
justify a little more work on it.

Regards

| Guillermo Ballester Valor   |  
| [EMAIL PROTECTED]  |  
| c/ cordoba, 19  |
| 18151-Ogijares (Spain)  |
| (Linux registered user 1171811) |



Mersenne: Online proof of Lucas-Lehmer

1999-10-05 Thread Joth Tupper

The Lucas-Lehmer test seems pretty magical to me and I have wanted to see a
full proof of the theorem for some time.

For a long time, there has been a proof online that if M(p) divides the
term in the LL sequence, then M(p) is prime, but I had still never seen
the other half.  So I worked out the rest of a proof.

If interested, please take a look at http://www.jt-actuary.com/lucas-le.htm.
I see one "typo" where the "less than" symbol (<) displays incorrectly
online.  There may be other typos and even errors.  (Well, I can hope
not...)

If this seems readable, would anyone want it linked in to a FAQ?
(Not my call.  I just wanted a proof.)
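
For readers who have not seen the test itself, here is a small sketch of
the statement being proved: for an odd prime p, M(p) = 2^p - 1 is prime
exactly when M(p) divides S(p-2), where S(0) = 4 and S(k+1) = S(k)^2 - 2.
The function name and the limit p <= 31 are choices made for this
illustration; real programs such as the ones discussed on this list use
FFT-based multiplication to handle much larger exponents.

```c
#include <assert.h>
#include <stdint.h>

/* Lucas-Lehmer test for M(p) = 2^p - 1, where p is an odd prime with
 * 3 <= p <= 31 so that every intermediate value fits in 64 bits.
 * Returns 1 if M(p) is prime, 0 otherwise. */
int lucas_lehmer(unsigned p)
{
    uint64_t m, s;
    unsigned i;

    assert(p >= 3 && p <= 31);
    m = ((uint64_t)1 << p) - 1;     /* M(p) */
    s = 4;                          /* S(0) */
    for (i = 0; i < p - 2; i++)     /* S(k+1) = S(k)^2 - 2 (mod M(p)) */
        s = (s * s + m - 2) % m;    /* adding m keeps the value >= 0  */
    return s == 0;                  /* prime iff M(p) divides S(p-2)  */
}
```

For example, lucas_lehmer(7) reports that M(7) = 127 is prime, while
lucas_lehmer(11) reports that M(11) = 2047 = 23 * 89 is not.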

Trivial note:  back in the summer of 1968, I was one of a bunch of high
school kids who met D. H. Lehmer (himself the Lehmer of LL fame; his
father D. N. Lehmer was also a UC Berkeley math professor) in the basement
of one of the engineering buildings at Berkeley.  (It may have been a math
building then, but the location is now an engineering building and the
math buildings were in three other spots even then.)

He showed us his "DLS-127" (Delay Line Sieve).  This was one of the best
prime-crunchers of the 1960's.  As I recall, it was an ANALOG computer
based on very precise inductors.

Thanks,

Joth

