I've now made the speedup I mentioned to the Fast Hartley Transform code in flint2.
Again I pick on the benchmarks Robert gave:

> sage: f = (x+1)^1000
> sage: timeit("f*f")
> 5 loops, best of 3: 142 ms per loop
> sage: timeit("f._mul_karatsuba(f)")
> 5 loops, best of 3: 655 ms per loop

That time is down to 5.5 ms with the FHT code in FLINT. (If we change
that to f = (x+1)^10000, then the time to compute f*f is now about 255 ms.)

Bill.

On Apr 28, 8:01 pm, Bill Hart <goodwillh...@googlemail.com> wrote:
> I coded up a basic classical multiplication (no attempt whatsoever to
> prevent cancellation). Fortunately the results of the classical and
> Hartley multiplication routines agree (up to some precision loss).
>
> I picked on the timings Robert gave:
>
> sage: f = (x+1)^1000
> sage: timeit("f*f")
> 5 loops, best of 3: 142 ms per loop
> sage: timeit("f._mul_karatsuba(f)")
> 5 loops, best of 3: 655 ms per loop
>
> So with the classical algorithm this multiplication takes about 16 ms.
> With the Fast Hartley Transform it currently takes 26 ms.
>
> If we change that to (1+x)^10000, then the times are 1400 ms and 610 ms
> respectively.
>
> The FHT should be much faster after I make an important speedup some
> time today or tomorrow. It should comfortably win in that first
> benchmark.
>
> Bill.
>
> On Apr 28, 6:01 pm, Bill Hart <goodwillh...@googlemail.com> wrote:
> > Rader-Brenner is less stable because it uses purely imaginary twiddle
> > factors whose absolute values are considerably bigger than 1. It has
> > been largely discarded in favour of the split-radix method, which uses
> > fewer real multiplications and additions anyhow.
> >
> > It's very interesting to see an implementation of it, because it is
> > not described in very many places!
> >
> > Bill.
> >
> > On Apr 28, 5:10 pm, Bill Hart <goodwillh...@googlemail.com> wrote:
> > > A very nice looking library, but it uses Karatsuba and Toom-Cook, the
> > > very things we decided were unstable. I also read that the
> > > Rader-Brenner FFT is very unstable.
> > >
> > > Bill.
> > >
> > > On Apr 28, 2:14 pm, YannLC <yannlaiglecha...@gmail.com> wrote:
> > > > This might be of interest, from the people of MPFR:
> > > >
> > > > "Mpfrcx is a library for the arithmetic of univariate polynomials
> > > > over arbitrary precision real (Mpfr) or complex (Mpc) numbers,
> > > > without control on the rounding. For the time being, only the few
> > > > functions needed to implement the floating point approach to
> > > > complex multiplication are implemented. On the other hand, these
> > > > comprise asymptotically fast multiplication routines such as
> > > > Toom–Cook and the FFT."
> > > >
> > > > http://www.multiprecision.org/index.php?prog=mpfrcx
> > > >
> > > > On Apr 28, 3:53 am, Bill Hart <goodwillh...@googlemail.com> wrote:
> > > > > You can access the code here:
> > > > >
> > > > > http://selmer.warwick.ac.uk/gitweb/flint2.git?a=shortlog;h=refs/heads...
> > > > >
> > > > > You need the latest MPIR release candidate, the latest MPFR, and
> > > > > either an x86 or x86_64 machine to run it.
> > > > >
> > > > > Set the lib and include paths in the top level makefile, set the
> > > > > LD_LIBRARY_PATHs in flint_env, then do:
> > > > >
> > > > > source flint_env
> > > > > make library
> > > > > make check MOD=mpfr_poly
> > > > >
> > > > > Most of the relevant code is in the mpfr_poly directory.
> > > > >
> > > > > If it doesn't work for you, please understand this is development
> > > > > code only, and therefore probably broken in many ways.
> > > > >
> > > > > Information on the Fast Hartley Transform can be found in Joerg
> > > > > Arndt's fxtbook:
> > > > >
> > > > > http://www.jjj.de/fxt/fxtbook.pdf
> > > > >
> > > > > Bill.
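To make the Hartley approach concrete, here is a minimal pure-Python sketch of convolution via the discrete Hartley transform. It is not the FLINT code (which works with MPFR coefficients and a fast transform); it uses plain doubles, a direct O(n^2) transform, and made-up names (dht, hartley_convolve), but it shows the two facts the thread relies on: the DHT is its own inverse up to a factor of 1/n, and a cyclic convolution needs only real transforms.

    import math

    def dht(a):
        # Direct O(n^2) discrete Hartley transform:
        # H[k] = sum_j a[j] * cas(2*pi*j*k/n), where cas(t) = cos(t) + sin(t).
        n = len(a)
        return [sum(a[j] * (math.cos(2 * math.pi * j * k / n) +
                            math.sin(2 * math.pi * j * k / n))
                    for j in range(n))
                for k in range(n)]

    def hartley_convolve(x, y):
        # Cyclic convolution via the Hartley convolution theorem:
        # Z[k] = (X[k]*(Y[k] + Y[-k]) + X[-k]*(Y[k] - Y[-k])) / 2, indices mod n.
        n = len(x)
        X, Y = dht(x), dht(y)
        Z = []
        for k in range(n):
            m = (n - k) % n
            Z.append(0.5 * (X[k] * (Y[k] + Y[m]) + X[m] * (Y[k] - Y[m])))
        # The DHT is its own inverse up to scaling, so applying it again
        # and dividing by n recovers the convolution.
        return [z / n for z in dht(Z)]

    # Polynomial product (1 + x)^2: zero-pad to length >= deg(f) + deg(g) + 1.
    print(hartley_convolve([1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]))
    # -> approximately [1.0, 2.0, 1.0, 0.0], i.e. 1 + 2x + x^2

Padding both inputs with zeros to a length of at least deg(f) + deg(g) + 1 turns the cyclic convolution into the ordinary polynomial product.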
> > > > > On Apr 28, 2:45 am, Bill Hart <goodwillh...@googlemail.com> wrote:
> > > > > > Well, I got the polynomial convolution working with the Fast
> > > > > > Hartley Transform. It seems to pass my primitive test code.
> > > > > >
> > > > > > As an example of a first timing, 1000 iterations of multiplying
> > > > > > polynomials of length 512, with 100 bits of floating point
> > > > > > precision per coefficient, takes 17s.
> > > > > >
> > > > > > I think I can make it take about half that time (at least), but
> > > > > > that is a job for tomorrow.
> > > > > >
> > > > > > I haven't looked yet to see how much precision is lost
> > > > > > heuristically.
> > > > > >
> > > > > > Bill.
> > > > > >
> > > > > > On Apr 27, 11:49 pm, Bill Hart <goodwillh...@googlemail.com> wrote:
> > > > > > > Well, I coded up a mini mpfr_poly module and Fast Hartley
> > > > > > > Transform in flint2, using MPFR reals as coefficients.
> > > > > > >
> > > > > > > It's hard to know if it is doing the right thing; there's no
> > > > > > > test code yet. But it compiles and doesn't segfault. :-)
> > > > > > >
> > > > > > > A little bit more work is required to turn it into a
> > > > > > > convolution, but as far as I know, no inverse transform is
> > > > > > > required, as the transform is its own inverse. Maybe by
> > > > > > > tomorrow I'll have a fast convolution and we'll be able to
> > > > > > > answer some of these questions heuristically.
> > > > > > >
> > > > > > > It won't be super efficient (no truncation, no radix 4, no
> > > > > > > special case for double precision coefficients, no cache
> > > > > > > friendly matrix Fourier stuff, etc.), but I did reuse the
> > > > > > > twiddle factors as much as possible, I think.
> > > > > > >
> > > > > > > Bill.
> > > > > > >
> > > > > > > On Apr 27, 5:18 pm, Bill Hart <goodwillh...@googlemail.com> wrote:
> > > > > > > > Also see pages 252 and following of Polynomial and Matrix
> > > > > > > > Computations, Volume 1, by Dario Bini and Victor Pan, which
> > > > > > > > seems to answer your question in detail.
> > > > > > > >
> > > > > > > > Bill.
> > > > > > > >
> > > > > > > > On Apr 27, 4:57 pm, Bill Hart <goodwillh...@googlemail.com> wrote:
> > > > > > > > > Numerical stability is not something I have any experience
> > > > > > > > > with. I am not sure if it is equivalent in some sense to
> > > > > > > > > the loss of precision which occurs when doing FFT
> > > > > > > > > arithmetic using a floating point FFT.
> > > > > > > > >
> > > > > > > > > The issue seems to be accumulation of "numerical noise".
> > > > > > > > > There are proven bounds on how many bits are lost to this
> > > > > > > > > as the computation proceeds, but the general consensus has
> > > > > > > > > been that for "random coefficients" the numerical noise is
> > > > > > > > > much, much less than the proven bounds would indicate.
> > > > > > > > >
> > > > > > > > > Zimmermann, Gaudry and Kruppa suggest that the following
> > > > > > > > > paper contains information about proven bounds for the
> > > > > > > > > floating point FFT:
> > > > > > > > >
> > > > > > > > > Percival, C. Rapid multiplication modulo the sum and
> > > > > > > > > difference of highly composite numbers. Math. Comp. 72,
> > > > > > > > > 241 (2003), 387-395.
> > > > > > > > >
> > > > > > > > > Paul Zimmermann would almost certainly know what the
> > > > > > > > > current state of the art is.
> > > > > > > > >
> > > > > > > > > Similar stability issues arise in matrix arithmetic with
> > > > > > > > > floating point coefficients.
> > > > > > > > > It seems that one can always prove something when you
> > > > > > > > > know something about the successive minima, but I am not
> > > > > > > > > too sure what the equivalent for an FFT would be. You
> > > > > > > > > might be in the realm of research here rather than of
> > > > > > > > > settled results; i.e. it might just be conventional wisdom
> > > > > > > > > that the FFT behaves well.
> > > > > > > > >
> > > > > > > > > The people to ask about this are definitely Joris van der
> > > > > > > > > Hoeven and Andreas Enge. If you find a very nice
> > > > > > > > > algorithm, let me know and we'll implement it in flint2,
> > > > > > > > > which should have a type for polynomials with floating
> > > > > > > > > point coefficients (using MPFR).
> > > > > > > > >
> > > > > > > > > Bill.
> > > > > > > > >
> > > > > > > > > On Apr 27, 2:45 pm, Jason Grout <jason-s...@creativetrax.com> wrote:
> > > > > > > > > > On 04/26/2010 10:54 PM, Robert Bradshaw wrote:
> > > > > > > > > > > I should comment on this, as I wrote the code and
> > > > > > > > > > > comments in question. There actually is a fair amount
> > > > > > > > > > > of research out there on stable multiplication of
> > > > > > > > > > > polynomials over the real numbers, but (if I remember
> > > > > > > > > > > right, it was a while ago) there were some results to
> > > > > > > > > > > the effect that no asymptotically fast algorithm has
> > > > > > > > > > > good stability properties over all possible
> > > > > > > > > > > coefficients.
> > > > > > > > > >
> > > > > > > > > > If you have a moment and if you remember, could you
> > > > > > > > > > point out a reference or two? I searched for several
> > > > > > > > > > hours before posting, and couldn't find anything that
> > > > > > > > > > seemed to address the question of stability
> > > > > > > > > > satisfactorily.
> > > > > > > > > >
> > > > > > > > > > > Of course, in practice one often runs into "well
> > > > > > > > > > > behaved" polynomials in R[x]. (In particular, most of
> > > > > > > > > > > the interest is, understandably, in truncated power
> > > > > > > > > > > series, though there are other uses.) The obvious
> > > > > > > > > > > thing to do is a (non-discrete) FFT. What I was
> > > > > > > > > > > thinking about implementing (though I never got back
> > > > > > > > > > > to it) is something that works like this:
> > > > > > > > > > >
> > > > > > > > > > > 1) Choose some alpha and rescale f(x) -> f(alpha*x) so
> > > > > > > > > > > that all coefficients have roughly the same size.
> > > > > > > > > > > 2) Multiply the resulting polynomial over Z using the
> > > > > > > > > > > insanely fast code in FLINT. The resulting answer will
> > > > > > > > > > > be the exact product of the truncated input, and the
> > > > > > > > > > > precision loss (conversely, the precision required for
> > > > > > > > > > > no loss) was fairly easy to work out, if I remember
> > > > > > > > > > > right.
> > > > > > > > > > > 3) Cast the result back into R[x].
> > > > > > > > > >
> > > > > > > > > > Thanks!
> > > > > > > > > > Jason
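As a concrete reading of the three-step approach Robert describes above, here is a minimal Python sketch. Python's arbitrary-precision integers stand in for FLINT's fast integer polynomial multiplication, step 1 (choosing alpha to balance the coefficients) is omitted, and the name mul_via_integers is made up for illustration.

    def mul_via_integers(f, g, prec_bits=100):
        # Steps 2 and 3: scale real coefficients up by 2^prec_bits, round
        # to integers, multiply exactly over Z, then scale the product
        # back down. The integer product is exact, so all precision loss
        # happens in the initial rounding, which is what makes the loss
        # easy to account for.
        scale = 1 << prec_bits
        F = [round(c * scale) for c in f]
        G = [round(c * scale) for c in g]
        # Exact classical product over Z; FLINT would use an
        # asymptotically fast algorithm here instead.
        H = [0] * (len(F) + len(G) - 1)
        for i, a in enumerate(F):
            for j, b in enumerate(G):
                H[i + j] += a * b
        return [h / (scale * scale) for h in H]

    print(mul_via_integers([1.0, 1.0], [1.0, 1.0]))  # (1 + x)^2 -> [1.0, 2.0, 1.0]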