Mersenne Digest V1 #619

Mersenne Digest Sat, 28 Aug 1999 15:06:25 -0700

Mersenne Digest       Saturday, August 28 1999       Volume 01 : Number 619




----------------------------------------------------------------------

Date: Tue, 24 Aug 1999 22:20:50 EDT
From: [EMAIL PROTECTED]
Subject: Mersenne: targeting multiply/add (Was: K7 vs. x86)

Hi, Jason:

>Or, if you just want to cheapen a complex multiply, turn the standard
>form 
>
>   xr,xi * yr,ri ->   xr*yr-xi*yi, xr*yi+xi*yr
>
>into
>
>   xr,xi * yr,ri ->   xr*(yr - xi/xr * yi), xr*(yi + xi/xr * yr)
>
>and precompute xr and xi/xr. Presto! 2 multiplies and 2 multiply-adds.

An excellent suggestion - why rely on the compiler perhaps being smarter
than it really is, when you drop a broad hint with very little modification
of the normal code?

I tried the above in just the forward radix-16 FFT loop of my code, replacing
sines with tangents in the precomputation of the FFT sincos data. No timing
change on the MIPS R10K, indicating that the MIPSPro f90 compiler was likely
already doing such a replacement for me. But on the Alpha 21064 (which has
no MADD instruction) my times for large FFT lengths dropped about 5%!
(I expect another 5% when I modify the inverse FFT similarly).
Weird, but welcome. Any ideas how the above replacement might improve
pipelineability of a twiddle-multiply/add/subtract sequence, assuming just
FMUL and FADD are available?

Onward and Upward,
- -Ernst

_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Wed, 25 Aug 1999 05:06:11 +0200 (MET DST)
From: [EMAIL PROTECTED]
Subject: Re:  Mersenne: targeting multiply/add (Was: K7 vs. x86)

Ernst Mayer writes:

- - Hi, Jason:
- - 
- - >Or, if you just want to cheapen a complex multiply, turn the standard
- - >form 
- - >
- - >   xr,xi * yr,ri ->   xr*yr-xi*yi, xr*yi+xi*yr
- - >
- - >into
- - >
- - >   xr,xi * yr,ri ->   xr*(yr - xi/xr * yi), xr*(yi + xi/xr * yr)
- - >
- - >and precompute xr and xi/xr. Presto! 2 multiplies and 2 multiply-adds.
- - 
- - An excellent suggestion - why rely on the compiler perhaps being smarter
- - than it really is, when you drop a broad hint with very little modification
- - of the normal code?
- - 
- - I tried the above in just the forward radix-16 FFT loop of my code, replacing
- - sines with tangents in the precomputation of the FFT sincos data. No timing
- - change on the MIPS R10K, indicating that the MIPSPro f90 compiler was likely
- - already doing such a replacement for me. But on the Alpha 21064 (which has
- - no MADD instruction) my times for large FFT lengths dropped about 5%!
- - (I expect another 5% when I modify the inverse FFT similarly).
- - Weird, but welcome. Any ideas how the above replacement might improve
- - pipelineability of a twiddle-multiply/add/subtract sequence, assuming just
- - FMUL and FADD are available?


    Compared to the straightforward coding, Jason's formula
uses the same instruction count but in a different order.

    Assume yr, yi, and xr, xi (or xr, xi/xr) are in registers, 
and we want the result to go into registers (zr, zi).
On a machine with multiply and multiply-(add/subtract/reverse subtract)
available, the translations might be

        Standard                      Jason
        temp1 = xi*yi                 temp1 = yr - (xi/xr)*yi
        temp2 = xi*yr                 temp2 = yi + (xi/xr)*yr
        zr = xr*xr - temp1            zr = xr*temp1
        zi = xr*yi + temp2            zi = xr*temp2


    Jason's method needs fewer temporaries on a machine with separate
add and multiply instructions.  The same register 
might be used for (xi/xr)*yi, temp1, and zr.  
The standard method needs separate temporaries for xr*yr and xi*yi.

    Jason's method lets the add instructions start earlier, during the
twiddle computation -- the standard method saturates the multipliers.
Jason's method may let the multiplies needed for zr and zi be combined 
with additions later in the code, on architectures with multiply-add.

    When (xr, xi) denotes   cos(theta) + i*sin(theta), 
Jason's method needs tables with cos(theta) and tan(theta).
The standard method lets one use sin(theta) = cos(pi/2 - theta)
to reduce table space for trigonometric functions.

    Jason's method needs special casing if xr might be zero, in which
case (xr, xi) is pure imaginary.

    I am not sure which method has more potential round-off error.

        Peter Montgomery
        [EMAIL PROTECTED]


_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Wed, 25 Aug 1999 09:53:24 +0200
From: Laurent Desnogues <[EMAIL PROTECTED]>
Subject: Mersenne: Regarding:  multiply/add

Jason Stratos Papadopoulos wrote:
> 
> There was a paper in (I believe) Appl. Math and Comp. from about two
> years ago that reformulated radix-2,3,4 and 5 FFT butterfiles to use
> multiply-adds wherever possible. The savings can be impressive: a radix-2
> FFT butterfly has 4 multiplies and six adds, but this boils down to
> six multiply-adds. Can't the RS6000 do two multiply-adds per clock?

   I have found a paper titled "Implementation of Efficient FFT Algorithms
on Fused Multiply-Add Architectures" by E. Linzer and E. Feig, IEEE Trans.
on Signal Processing, vol. 41, no 1, Jan 93.  They call their method
scaled FFT.

   For a radix-8, with size 8, Cooley-Tukey FFT has 56 m/a ops and their
method has 52 m/a ops.  The number of m/a ops for radix 8 is:

        - Cooley-Tukey:   7/2 n log_2(n) - 57/14 n + 32/7
        - scaled FFT:    11/4 n log_2(n) - 57/28 n + 16/7.

   Regarding the errors (root-mean-square error per point):
                              DFT        inverse DFT
        - Cooley-Tukey:  5.3500 10^-15   3.3999 10^-15
        - scaled FFT:    5.3441 10^-15   3.2386 10^-15

They say the impovement in error is probably due to the fast that a m/a
op is more precise than a mult followed by an add.

   Their experiments were done on an RS/6000 model 530.

   I can't comment on that article, since I'm not mathematically-gifted
enough to just read it without a lot of time with a pencil!


                Laurent
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Wed, 25 Aug 1999 21:14:23 EDT
From: [EMAIL PROTECTED]
Subject: Mersenne: Compaq dumps NT-on-Alpha

Check out

www.cnn.com/tech/computing/9908/24/nt.dump.idg/index.html

I wonder whether this tells us more about WinNT, the Alpha platform's future,
or Compaq's financial state?

- -Ernst
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Wed, 25 Aug 1999 22:40:55 -0400 (EDT)
From: Chip Lynch <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Compaq dumps NT-on-Alpha

This confuses me... slashdot carried this story in the last 48 hours (at
least I think it was them; I can't find the link now), but I thought it
was very clear that CompaQ was dropping only the 32-Bit Windows dvelopment
from Alpha, and KEEPING 64-Bit.  The CNN story definitely leads you to
believe that ALL NT support is being dropped.

As much as I applaud forcing Microsoft to evolve by threatening them with
rival operating systems, I can't imagine Compaq could completely drop NT
support from the Alpha and not lost major revenue.  Anyone closer to this
know the story?

- ---Chip

On Wed, 25 Aug 1999 [EMAIL PROTECTED] wrote:

> Check out
> 
> www.cnn.com/tech/computing/9908/24/nt.dump.idg/index.html
> 
> I wonder whether this tells us more about WinNT, the Alpha platform's future,
> or Compaq's financial state?
> 
> -Ernst
> _________________________________________________________________
> Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
> Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
> 

       \\ ^ //
        (o o)
 ---oOO--(_)--OOo------------------------------------
| Chip Lynch            |   Computer Guru            |
| [EMAIL PROTECTED]       |                            | 
| (703) 465-4176   (w)  |   (202) 362-7978   (h)     |
 ----------------------------------------------------

_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Wed, 25 Aug 1999 20:13:08 -0700
From: "John R Pierce" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Compaq dumps NT-on-Alpha

> This confuses me... slashdot carried this story in the last 48 hours (at
> least I think it was them; I can't find the link now), but I thought it
> was very clear that CompaQ was dropping only the 32-Bit Windows dvelopment
> from Alpha, and KEEPING 64-Bit.  The CNN story definitely leads you to
> believe that ALL NT support is being dropped.

That was the spin of the various stories a few days ago.  And yes, news.com
has a similar story today saying its apparently ALL nt/alpha development.

> As much as I applaud forcing Microsoft to evolve by threatening them with
> rival operating systems, I can't imagine Compaq could completely drop NT
> support from the Alpha and not lost major revenue.  Anyone closer to this
> know the story?

It appears Compaq is bleeding red ink, and Alpha/NT is not very profitable
for them short term.  Further, in effect, by putting Alpha 64 bit
development resources, they are really helping out Intel's IA-64 Merced by
insuring there will be a 64 bit NT/Win2000 ready when Merced actually rolls
out.

- -jrp

_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Thu, 26 Aug 1999 08:21:34 -0600
From: "Aaron Blosser" <[EMAIL PROTECTED]>
Subject: RE: Mersenne: Compaq dumps NT-on-Alpha

> > As much as I applaud forcing Microsoft to evolve by threatening
> them with
> > rival operating systems, I can't imagine Compaq could completely drop NT
> > support from the Alpha and not lost major revenue.  Anyone
> closer to this
> > know the story?
>
> It appears Compaq is bleeding red ink, and Alpha/NT is not very profitable
> for them short term.  Further, in effect, by putting Alpha 64 bit
> development resources, they are really helping out Intel's IA-64 Merced by
> insuring there will be a 64 bit NT/Win2000 ready when Merced
> actually rolls
> out.

This has me boggled as well.  At the Compaq conference a few weeks ago (at
which Capellas, the new CEO, gave the opening remarks), there was no
indication of this whatsoever.  The Alphas they had setup were running Tru64
for the most part, but they did have a couple Alpha systems running Win2K,
showing those off.

Hmm...  Once I get my CD covering the conference events, I'll have to go
over his opening remarks and look for clues that this would happen.  I do
know that Capellas intended to shake things up at Compaq, but this seems so
bizarre.

Aaron

_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Thu, 26 Aug 1999 10:25:53 -0500
From: Brian Pinto <[EMAIL PROTECTED]>
Subject: RE: Mersenne: Compaq dumps NT-on-Alpha

This is the link that was posted on slashdot:
http://www.zdnet.com/pcweek/stories/news/0,4153,2318096,00.html


Slashdot's story was "Ixnay WinNT on Alpha" and can be found here:
(http://slashdot.org/article.pl?sid=99/08/20/204212&mode=thread)

> -----Original Message-----
> From: John R Pierce [SMTP:[EMAIL PROTECTED]]
> Sent: Wednesday, August 25, 1999 10:13 PM
> To:   Mersenne discussion list
> Subject:      Re: Mersenne: Compaq dumps NT-on-Alpha
> 
> > This confuses me... slashdot carried this story in the last 48 hours
> (at
> > least I think it was them; I can't find the link now), but I thought
> it
> > was very clear that CompaQ was dropping only the 32-Bit Windows
> dvelopment
> > from Alpha, and KEEPING 64-Bit.  The CNN story definitely leads you
> to
> > believe that ALL NT support is being dropped.
> 
> That was the spin of the various stories a few days ago.  And yes,
> news.com
> has a similar story today saying its apparently ALL nt/alpha
> development.
> 
> > As much as I applaud forcing Microsoft to evolve by threatening them
> with
> > rival operating systems, I can't imagine Compaq could completely
> drop NT
> > support from the Alpha and not lost major revenue.  Anyone closer to
> this
> > know the story?
> 
> It appears Compaq is bleeding red ink, and Alpha/NT is not very
> profitable
> for them short term.  Further, in effect, by putting Alpha 64 bit
> development resources, they are really helping out Intel's IA-64
> Merced by
> insuring there will be a 64 bit NT/Win2000 ready when Merced actually
> rolls
> out.
> 
> -jrp
> 
> _________________________________________________________________
> Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
> Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Sat, 28 Aug 1999 18:24:42 EDT
From: [EMAIL PROTECTED]
Subject: Mersenne: How about some C code?

Ok, I've cleaned up the Mlucas 2.6 source code and I think we're
ready to take a crack at a C version. The revised source (which
also includes a fix for a bug which David Willmore brought to
my attention - don't worry, it can only cause the code to quit
unexpectedly when it starts a new exponent as part of a multi-exponent
range test - any actual test results should be fine) is at

ftp://209.133.33.182/pub/mayer/Mlucas_2.6c.f90.gz

and Alpha and MIPS executables are also available in the .../pub/mayer/bin
directory at the above site. 2.6c also has some slightly improved memory
management - it's 5-10% faster on my Alpha 21064 and 21164 than 2.6b was.

With v2.5, Gord Palameta was
able (using an automated translator and subsequent manual work) to
get a decent C version in a remarkably short time; Brian Beuning
also created one, but we never did any head-to-head comparisons of
the two resulting codes.

This time, I'd like to keep some coherency to any such effort, so once
I see who offers to help, we can discuss how distribute the labor. Of course
anyone is free to go it alone, but I will ask Conrad Curry, who maintains
the GIMPS "other available source code" page to only post a link to a C
version if it gets my stamp of approval.

Potential benefits of an Mlucas.c:

1) Will run on any Unix platform; no special compiler should be needed
  (though we need to find the best C compiler for each platform);

2) Will make it easier to interface with the eventual PrimeNet for Unix;

3) Will make it easier to interface with Peter Montgomery's Mfactor code;

4) Will allow us to try neat tricks like inlining cache-bypassing stores
   (if the architecture supports such an operation) which might really
   help speed the code on systems with small caches.

With that said, are there any volunteers? I'll of course be around to help
explain what-this-and-that in the F90 code means, but will be putting most
of my near-term effort into a P-1 factorer geared for large exponents.

Cheers,
Ernst

_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

End of Mersenne Digest V1 #619
******************************
Mersenne Digest V1 #619

Reply via email to