Hi,
Does anyone have experience with implementing rational function
approximations to a given special function of one variable? This would be
an extremely useful addition to sympy. Here is an example for the error
function from the standard gfortran library:

https://github.com/mirrors/gcc/blob/master/libgfortran/intrinsics/erfc_scaled_inc.c

Whenever you call the error function in a Fortran program compiled with
gfortran, this is the function that gets called, so it needs to be
accurate (in double precision) and very fast.
If you look at the implementation, they split the real axis into several
intervals:
[0, 0.46875] 4 terms
(0.46875, 4] 8 terms
(4, oo) 5 terms
On each interval they use a rational function approximation that is
guaranteed to provide at least 18 significant decimal digits; I've
indicated the number of terms for each interval above.
So you cannot get more accurate than that in double precision, and in
terms of speed this is pretty much impossible to beat.
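To illustrate the pattern, here is a quick Python sketch of what each branch of such a routine does: two polynomials evaluated with Horner's rule plus one division. The coefficients below are a toy example, not the actual erfc tables from the gfortran source.

```python
def horner(coeffs, x):
    # coeffs in ascending order: coeffs[0] + coeffs[1]*x + coeffs[2]*x**2 + ...
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def rational(x, p, q):
    # p(x)/q(x); in a routine like the gfortran one, p and q would be
    # the fixed per-interval coefficient tables
    return horner(p, x) / horner(q, x)

# toy example: (1 + 2x)/(1 + x) at x = 3 gives 7/4 = 1.75
val = rational(3.0, [1.0, 2.0], [1.0, 1.0])
```

The whole evaluation is a handful of multiply-adds and one division per call, which is why this scheme is so hard to beat for speed.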
What I actually need is a similar approximation for the modified Bessel
functions of half-integer order. I've been learning the methods.
Originally I thought I would just implement a general hypergeometric
function (of which the Bessel functions are a simple special case), and
Fredrik has been super helpful with this, as he implemented general
solvers in mpmath for arbitrary precision. mpmath works great, but it's
slow. I've implemented a similar hypergeometric function in Fortran in
double precision, and it's still about 10x slower than my series
approximation from sympy (directly copy&pasted into Fortran).
The challenge is to choose the intervals on which each approximation
works; so far I've been checking the accuracy by hand. I really need this
to be fast, so I realized that the only way to nail this down once and
for all is to use the same tricks as in the error function implementation
above.
The interface to sympy that I am imagining would be to give sympy a
formula (later maybe even just a numerical function, for cases where no
simple formula exists). For example, the difficult part of I_{9/2}(x) is:
In [1]: r = (105/x**4 + 45/x**2 + 1)*sinh(x) - (105/x**3 + 10/x)*cosh(x)
This is a simple exact formula (I am actually lucky that such a formula
exists; typically I only have a general hypergeometric series that needs
to be summed up, as for the error function). However, even this formula
*cannot* be used for small "x", for example:
In [2]: r.subs(x, S(1)/10).n()
Out[2]: 1.05868215119243e-8
In [3]: r.subs(x, S(1)/10.)
Out[3]: 1.05937942862511e-8
Here [2] is the correct answer (using the adaptive evaluation that
Fredrik implemented in mpmath), while [3] simply evaluates the formula in
floating point (similar to what Fortran does). As you can see, of the
roughly 15 digits available, the result in [3] got only about 3 right,
due to numerical cancellation. So that's unusable.
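The same cancellation is easy to reproduce in plain Python doubles; the reference value 1.05868215119243e-8 below is the correct answer [2] from above:

```python
import math

x = 0.1
# naive evaluation of the closed formula [1]
naive = ((105/x**4 + 45/x**2 + 1) * math.sinh(x)
         - (105/x**3 + 10/x) * math.cosh(x))
# truncated Taylor series (see [5] below)
series = (x**5/945 + x**7/20790 + x**9/1081080
          + x**11/97297200 + x**13/13232419200)
# "naive" subtracts two numbers of size ~1e5 to get a result of size
# ~1e-8, losing roughly 13 of the ~16 digits a double carries; the
# series has only positive terms, so it keeps full accuracy
```

Printing both values shows `naive` agreeing with the correct 1.05868215119243e-8 to only a few digits, exactly as in [3].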
The solution that I implemented in my program for now is this:
In [4]: s = r.series(x, 0, 15).removeO()
In [5]: s
Out[5]: x**13/13232419200 + x**11/97297200 + x**9/1081080 + x**7/20790
+ x**5/945
In [6]: s.subs(x, S(1)/10).n()
Out[6]: 1.05868215119243e-8
In [7]: s.subs(x, S(1)/10.)
Out[7]: 1.05868215119243e-8
[6] and [7] agree to all significant digits, which means that the series
itself can be summed accurately in floating point. And the agreement of
[6] with [2] means that this series reproduces the exact answer to all
significant digits. So we know that for x = 0.1 we can use this series.
By experimenting with the formula, I found that for x > 4 the formula [1]
gives the exact answer in double precision:
In [8]: r.subs(x, S(4)).n()
Out[8]: 2.16278782780322
In [9]: r.subs(x, 4.)
Out[9]: 2.16278782780323
That's good enough. Below 4 it is less accurate, for example:
In [10]: r.subs(x, 1.)
Out[10]: 0.00110723646096744
In [11]: r.subs(x, S(1)).n()
Out[11]: 0.00110723646098546
Finally, the series [5] seems accurate up to x = 0.4:
In [12]: r.subs(x, S(4)/10).n()
Out[12]: 1.09150288698177e-5
In [13]: s.subs(x, S(4)/10).n()
Out[13]: 1.09150288698173e-5
So after this painful analysis, we have found:
[0, 0.4] use [5]
[4, oo] use [1]
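Put together, the piecewise evaluator that falls out of this analysis would look something like the sketch below; the gap (0.4, 4) is exactly what still needs to be covered:

```python
import math

def r_piecewise(x):
    if x <= 0.4:
        # truncated series [5]: positive terms, no cancellation
        return (x**5/945 + x**7/20790 + x**9/1081080
                + x**11/97297200 + x**13/13232419200)
    if x >= 4.0:
        # closed formula [1]: accurate here in double precision
        return ((105/x**4 + 45/x**2 + 1) * math.sinh(x)
                - (105/x**3 + 10/x) * math.cosh(x))
    raise NotImplementedError("interval (0.4, 4) not yet covered")
```

At x = 0.1 this reproduces [2]/[6] and at x = 4 it reproduces [8]/[9] from above.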
Now we could expand the function around x = 1 and repeat the analysis
above until the whole real axis is covered. So first of all, this should
be automated.
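As a sketch of what the automated check could look like: since the Taylor series of r has only positive terms, summing it fully in floating point is accurate for any x and can serve as a reference to test the closed formula against. The term ratio below is inferred by me from the denominators 945, 20790, 1081080, ... in [5], so treat it as an assumption:

```python
import math

def r_series(x):
    # sum the full Taylor series of r in floating point; all terms are
    # positive, so there is no cancellation. The ratio
    # t_{k+1}/t_k = x**2 / ((2k+2)(2k+11)) is inferred (assumption!)
    # from the denominators in [5].
    t = x**5 / 945.0
    total = t
    k = 0
    while t > 1e-17 * total:
        t *= x * x / ((2 * k + 2) * (2 * k + 11))
        total += t
        k += 1
    return total

def r_formula(x):
    # the closed formula [1]
    return ((105/x**4 + 45/x**2 + 1) * math.sinh(x)
            - (105/x**3 + 10/x) * math.cosh(x))

def rel_err(x):
    # relative disagreement between formula and reference series
    return abs(r_formula(x) / r_series(x) - 1.0)
```

Scanning `rel_err` over a grid of x values reproduces the crossover found by hand above: around 1e-14 at x = 4, about 1e-11 at x = 1, and catastrophic by x = 0.1.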
But a plain Taylor series is *not* the best approximation: it is far more
accurate than necessary near x = 0 and barely accurate near x = 0.4. A
better approach is the so-called Chebyshev approximation, which
distributes the error uniformly over the interval (the end result is that
fewer terms are needed).
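Computing the Chebyshev coefficients is only a few lines; this is essentially the chebft/chebev pair from Numerical Recipes, written in plain Python, with exp on [0, 1] as a stand-in test function:

```python
import math

def chebfit(f, a, b, n):
    # Chebyshev coefficients c[0..n-1] of f on [a, b], from samples of
    # f at the n Chebyshev nodes (the chebft routine in Numerical Recipes)
    fvals = []
    for k in range(n):
        y = math.cos(math.pi * (k + 0.5) / n)          # node in [-1, 1]
        fvals.append(f(0.5 * (b - a) * y + 0.5 * (b + a)))
    return [2.0 / n * sum(fvals[k] * math.cos(math.pi * j * (k + 0.5) / n)
                          for k in range(n))
            for j in range(n)]

def chebeval(c, a, b, x):
    # evaluate the truncated Chebyshev series with the Clenshaw
    # recurrence (the chebev routine in Numerical Recipes)
    y = (2.0 * x - a - b) / (b - a)
    d = dd = 0.0
    for cj in reversed(c[1:]):
        d, dd = 2.0 * y * d - dd + cj, d
    return y * d - dd + 0.5 * c[0]

# example: 10 coefficients reproduce exp on [0, 1] to well below 1e-9
c = chebfit(math.exp, 0.0, 1.0, 10)
```

Because the coefficients decay rapidly, you can also truncate the fitted series at the first coefficient below your target accuracy, which is exactly how the "fewer terms" saving shows up in practice.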
Finally, even better than a single polynomial is a rational function,
i.e. a ratio of two polynomials. This seems to be the most effective
representation.
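One standard way to get a rational approximation directly from Taylor coefficients is a Padé approximant: with q_0 = 1, the denominator coefficients solve the linear system sum_j q_j c_{k-j} = 0 for k = L+1..L+M, and the numerator then follows. Here is a small exact-arithmetic sketch of that construction (not the Numerical Recipes algorithm linked below, which works from function values instead), checked on the classic [2/2] approximant of exp:

```python
from fractions import Fraction
from math import factorial

def pade(c, L, M):
    # [L/M] Pade approximant from Taylor coefficients c[0..L+M];
    # returns (p, q), ascending coefficient lists with q[0] = 1
    # denominator: solve sum_{j=0..M} q_j c_{k-j} = 0 for k = L+1..L+M
    A = [[c[L + 1 + i - j] if L + 1 + i - j >= 0 else Fraction(0)
          for j in range(1, M + 1)] for i in range(M)]
    b = [-c[L + 1 + i] for i in range(M)]
    # Gaussian elimination with partial pivoting, exact over Fractions
    for col in range(M):
        piv = max(range(col, M), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, M):
            f = A[r][col] / A[col][col]
            for cc in range(col, M):
                A[r][cc] -= f * A[col][cc]
            b[r] -= f * b[col]
    q = [Fraction(0)] * M
    for r in range(M - 1, -1, -1):
        q[r] = (b[r] - sum(A[r][cc] * q[cc]
                           for cc in range(r + 1, M))) / A[r][r]
    q = [Fraction(1)] + q
    # numerator: p_k = sum_{j=0..min(k,M)} q_j c_{k-j}
    p = [sum(q[j] * c[k - j] for j in range(min(k, M) + 1))
         for k in range(L + 1)]
    return p, q

# classic check: [2/2] of exp(x) from c_k = 1/k! gives
# p = 1 + x/2 + x**2/12,  q = 1 - x/2 + x**2/12
c = [Fraction(1, factorial(k)) for k in range(5)]
p, q = pade(c, 2, 2)
```

The Padé approximant shares the "too accurate at the expansion point" drawback of the Taylor series, which is why the Chebyshev-based rational fits are preferable for uniform accuracy on an interval.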
I found an algorithm in Numerical Recipes:

http://www.mpi-hd.mpg.de/astrophysik/HEA/internal/Numerical_Recipes/f5-13.pdf

It only needs a numerical function as input, which is the best.
Does