[Jprogramming] Inaccuracy of repeated squaring WAS: Convolution using FFT

Henry Rich Fri, 31 Dec 2010 13:52:31 -0800

The reduced accuracy seems to stem from the change to repeated squaring 
in J701:


    9!:14''
j701/beta/2010-11-24/22:45
    a=.r.1p1%32
    3!:3 a
e1000000
10000000
01000000
00000000
2625d1a3
8dd8ef3f
2cb429bc
a617b93f
    3!:3 a^31
e1000000
10000000
01000000
00000000
3425d1a3
8dd8efbf
4eb429bc
a617b93f
    3!:3 ^31*^.a
e1000000
10000000
01000000
00000000
2525d1a3
8dd8efbf
3cb429bc
a617b93f

That much error on a small power suggests that repeated squaring might 
not be such a good idea.  Also, it would produce the situation where

{. +. a^n

is not monotone increasing when n is monotone increasing, which might 
lead to some subtle bugs in applications.

Henry Rich

On 12/31/2010 2:52 PM, Henry Rich wrote:
> Well, this is interesting.  I get different results depending on J
> version, and the J7 version is quite a bit less accurate:
>
>      9!:14 ''
> j701/beta/2010-11-24/22:45
>
>      3!:3 a =. r.1p1%32
> e1000000
> 10000000
> 01000000
> 00000000
> 2625d1a3
> 8dd8ef3f
> 2cb429bc
> a617b93f
>      5!:5<'a'
> 0.99518472667219693j0.098017140329560604
>      3!:3 b =. a^31
> e1000000
> 10000000
> 01000000
> 00000000
> 3425d1a3
> 8dd8efbf
> 4eb429bc
> a617b93f
>      5!:5<'b'
> _0.99518472667219848j0.098017140329561075
>
> The difference in the imaginary parts is the difference between 4e and
> 2c on the next-last line.
>
> In the previous version, it was 3c not 4e:
>
>      9!:14 ''
> j602/2008-03-03/16:45
>      3!:3 a =. r.1p1%32
> e1000000
> 10000000
> 01000000
> 00000000
> 2625d1a3
> 8dd8ef3f
> 2cb429bc
> a617b93f
>      3!:3 b =. a^31
> e1000000
> 10000000
> 01000000
> 00000000
> 2525d1a3
> 8dd8efbf
> 3cb429bc
> a617b93f
>       5!:5<'b'
> _0.99518472667219682j0.098017140329560826
>
> Now 3c vs 2c, much closer.  Perhaps something in a math library changed
> for the worse?
>
> It would be possible to polish the values of _1 ^ i % 2 * N by using
> Newton's method on the roots individually.  You should be able to get to
> within 1 or 2 ULPs of the correct value, I would think.  It would be
> interesting to see how much this reduces the error in the FFT.  I'm not
> going to look into this, but I hope someone will.
>
> For integer-multiplication purposes, the unpolished roots are adequate.
>
> Henry Rich
>
>
>
> On 12/30/2010 10:41 PM, Marshall Lochbaum wrote:
>> Okay, so this is not the method roots uses. However, using this method
>> returns the same results:
>>
>>      a=.r.1p1%32
>>      5!:5<'a'
>> 0.99518472667219693j0.098017140329560604
>>      b=.a^31
>>      5!:5<'b'
>> _0.99518472667219682j0.098017140329560826
>>      a-b
>> 1.99037j_2.22045e_16
>>
>> Marshall
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Henry Rich
>> Sent: Thursday, December 30, 2010 4:24 PM
>> To: Programming forum
>> Subject: Re: [Jprogramming] Convolution using FFT
>>
>> 1p1 * 31%32
>>
>> takes 5 more high-order bits than
>>
>> 1p1 * 1%32
>>
>> so the absolute error in representation is that much greater, and when you
>> take the sin, you should expect to lose at least that many low-order bits.
>> That seems to be approximately the difference you see here.
>>
>> IEEE floating point, with 53 bits of fraction, is at best accurate to within
>> 1 part in 1e16, so we are getting almost all the accuracy allowed by the
>> precision of the number system.
>>
>> Henry Rich
>>
>>
>> On 12/30/2010 3:35 PM, Marshall Lochbaum wrote:
>>> This looks to be correct. Take, for example, a calculation of
>>> sin(pi/32) and sin(31*pi/32).
>>>
>>>       a=.1 o. 1p1*1 31%32
>>>       5!:5<'a'
>>> 0.098017140329560604 0.098017140329560826
>>>       |@-/a
>>> 2.22045e_16
>>>
>>> Mathematica gives the exact value of the sin to be
>>>
>>> 0.098017140329560602
>>>
>>> so the first value is quite close and the second is farther off.
>>>
>>> Can anyone post timings on polynomial multiplication using fftw? It
>>> would seem that a package like this would be the best option for
>>> implementing convolution in extended integers.
>>>
>>> Marshall
>>>
>>> -----Original Message-----
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of Henry Rich
>>> Sent: Thursday, December 30, 2010 1:07 PM
>>> To: Programming forum
>>> Subject: Re: [Jprogramming] Convolution using FFT
>>>
>>> It seems to me that if you compute a =. _1 ^ 2 % N, and then take that
>>> value and raise it to a high power, you are going to propagate and
>>> multiply any original roundoff error of computing a.  Whereas, if you
>>> compute _1 ^ (n % N), you are going to get a more accurate result
>>> since n % N is exact (N is a power of 2).
>>>
>>> I admit that numerical analysis is not my best subject.
>>>
>>> I reasoned that by computing 1/8 or a revolution and using symmetry to
>>> produce two reflection of it, it would get a slightly more accurate
>>> result for the part it calculated, and then using symmetry would
>>> absolutely remove any bias in the coefficients.  The results seem to
>>> bear this out, in that the error terms after fft/ifft are smaller.
>>>
>>> I don't think this is an error in the J implementation, just an
>>> artifact of floating-point arithmetic.
>>>
>>> Henry Rich
>>>
>>> On 12/30/2010 12:35 PM, Bo Jacoby wrote:
>>>> http://www.jsoftware.com/jwiki/Essays/FFT says that a long version of
>>>> the
>>> verb  roots  has better accuracy for N>=8 .
>>>>
>>>> Has   _1&^  for computing complex-valued fractional powers of _1 really
>>> low accuracy?
>>>>
>>>> If that is the case it should be fixed in the implementation of  ^
>>>> rather
>>> than in a work-around, I think.
>>>>
>>>> Happy new year!
>>>>
>>>>
>>>> --- Den tors 30/12/10 skrev Roger Hui<[email protected]>:
>>>>
>>>>> Fra: Roger Hui<[email protected]>
>>>>> Emne: Re: [Jprogramming] Convolution using FFT WAS: Proposal for
>>>>> special code voor dyadic f/.@:(g/)
>>>>> Til: "Programming forum"<[email protected]>
>>>>> Dato: torsdag 30. december 2010 18.05 I have incorporated these
>>>>> further comments into the wiki page.
>>>>>
>>>>> The J system currently does not yet use FFT to multiply big ints.
>>>>> It should, and your code and comments will be helpful in that speed-up.
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>> From: Henry Rich<[email protected]>
>>>>> Date: Thursday, December 30, 2010 7:32
>>>>> Subject: Re: [Jprogramming] Convolution using FFT WAS:
>>>>> Proposal for special code voor dyadic f/.@:(g/)
>>>>> To: Programming forum<[email protected]>
>>>>>
>>>>>> The code I posted on the Wiki is for understanding,
>>>>> not for
>>>>>> performance
>>>>>> measurement.  Several improvements are possible:
>>>>>>
>>>>>> 0. The code recomputes roots of _1 (4 times!) every
>>>>> time it
>>>>>> runs; a
>>>>>> serious implementation would reuse the roots 1. The final pass,
>>>>>> which produces an imaginary-only
>>>>> result,
>>>>>> would be
>>>>>> replaced by a step to halve the size of the final FFT,
>>>>> which
>>>>>> would save
>>>>>> 25% overall
>>>>>> 2. You would just use the fftw library anyway, which
>>>>> is
>>>>>> optimized
>>>>>> already and is a few times faster
>>>>>>
>>>>>> The important point, to me, is that this in an O(*^.)
>>>>>
>>>>>> algorithm.  It can
>>>>>> feasibly multiply polynomials of a million terms, or
>>>>> numbers
>>>>>> with
>>>>>> millions of digits.
>>>>>>
>>>>>> I wanted to see whether it would work as well as the
>>>>> literature
>>>>>> suggests, and I think the answer is Yes.  I also
>>>>> wondered
>>>>>> whether Roger
>>>>>> should use something like that for multiplication of
>>>>> extended
>>>>>> integers,
>>>>>> and I think the answer is Yes to that too, though for
>>>>> all I know
>>>>>> he's
>>>>>> already doing it.
>>>>>>
>>>>>> I haven't looked at your obta.
>>>>>>
>>>>>> I question your use of tm*sz as a figure of merit.
>>>>> If I am
>>>>>> willing to
>>>>>> trade space for time I am happy to take 3x the space
>>>>> in 0.5x the
>>>>>> time.
>>>>>> That is especially true here, where any operands that
>>>>> are big
>>>>>> enough to
>>>>>> cause a memory problem would take days for the
>>>>> computation.
>>>>>>
>>>>>> Henry Rich
>>>>>>
>>>>>> On 12/30/2010 1:58 AM, R.E. Boss wrote:
>>>>>>>> -----Oorspronkelijk bericht-----
>>>>>>>> Van: [email protected]
>>>>> [mailto:programming-
>>>>>>>> [email protected]]
>>>>> Namens Henry Rich
>>>>>>>> Verzonden: dinsdag 28 december 2010 18:42
>>>>>>>> Aan: Programming forum
>>>>>>>> Onderwerp: Re: [Jprogramming] Convolution
>>>>> using FFT WAS:
>>>>>> Proposal for
>>>>>>>> special code voor dyadic f/.@:(g/)
>>>>>>>>
>>>>>>>> OK, all these suggestions are now in
>>>>>>>>
>>>>>>>> http://www.jsoftware.com/jwiki/Essays/FFT
>>>>>>>>
>>>>>>>> Henry Rich
>>>>>>>>
>>>>>>>
>>>>>>> Some observations.
>>>>>>>
>>>>>>>          'X Y'=. 8192 8193<@(?...@$)"(0) 1000
>>>>>>>          rank'iconvolve~ X';'iconvolve~ Y'
>>>>>>> +----+-----+----+----+
>>>>>>> |rank|tm*sz|time|size|
>>>>>>> +----+-----+----+----+
>>>>>>> | 0  |1.00 |1.00|1.00|
>>>>>>> +----+-----+----+----+
>>>>>>> | 1  |5.22 |2.61|2.00|
>>>>>>> +----+-----+----+----+
>>>>>>>          NB. jump in performance at
>>>>> 2^n
>>>>>>>
>>>>>>>          'X Y' =. 2 400 ?...@$ 1000x
>>>>>>>          datatype&>      X;Y;(X +/ obta * Y);X
>>>>>
>>>>>> iconvolve Y
>>>>>>> extended
>>>>>>> extended
>>>>>>> extended
>>>>>>> integer
>>>>>>>          NB. no extended output
>>>>>>>
>>>>>>>          'X Y' =. 2 8192 ?...@$ 1000
>>>>>>>          rank'X +/ obta * Y';'X iconvolve Y'
>>>>>>> +----+-----+----+----+
>>>>>>> |rank|tm*sz|time|size|
>>>>>>> +----+-----+----+----+
>>>>>>> | 0  |1.00 |6.49|1.00|
>>>>>>> +----+-----+----+----+
>>>>>>> | 1  |1.36 |1.00|8.83|
>>>>>>> +----+-----+----+----+
>>>>>>>          NB. fast, not lean
>>>>>>>
>>>>>>>          'X Y' =. 2 8193 ?...@$ 1000
>>>>>>>          rank'X +/ obta * Y';'X iconvolve Y'
>>>>>>> +----+-----+----+-----+
>>>>>>> |rank|tm*sz|time|size |
>>>>>>> +----+-----+----+-----+
>>>>>>> | 0  |1.00 |2.69| 1.00|
>>>>>>> +----+-----+----+-----+
>>>>>>> | 1  |6.56 |1.00|17.65|
>>>>>>> +----+-----+----+-----+
>>>>>>>          NB. overall less efficient
>>>>>>>
>>>>>>>          'X Y' =. 8000 9000<@(?...@$)"(0) 1000
>>>>>>>          rank'X +/ obta * Y';'X iconvolve Y'
>>>>>>> +----+-----+----+-----+
>>>>>>> |rank|tm*sz|time|size |
>>>>>>> +----+-----+----+-----+
>>>>>>> | 0  | 1.00|2.58| 1.00|
>>>>>>> +----+-----+----+-----+
>>>>>>> | 1  |13.26|1.00|34.15|
>>>>>>> +----+-----+----+-----+
>>>>>>>          NB. especially for arbitrary
>>>>> input
>>>>>>>
>>>>>>>          'X Y' =. 2 16384 ?...@$ 1000
>>>>>>>          rank'X +/ obta * Y';'X iconvolve Y'
>>>>>>> +----+-----+----+----+
>>>>>>> |rank|tm*sz|time|size|
>>>>>>> +----+-----+----+----+
>>>>>>> | 1  |1.41 |9.40|1.00|
>>>>>>> +----+-----+----+----+
>>>>>>> | 0  |1.00 |1.00|6.65|
>>>>>>> +----+-----+----+----+
>>>>>>>          NB. finally more efficient
>>>>>>>
>>>>>>>          'X Y' =. 16000 17000<@(?...@$)"(0) 1000
>>>>>>>          rank'X +/ obta * Y';'X iconvolve Y'
>>>>>>> +----+-----+----+-----+
>>>>>>> |rank|tm*sz|time|size |
>>>>>>> +----+-----+----+-----+
>>>>>>> | 0  |1.00 |4.13| 1.00|
>>>>>>> +----+-----+----+-----+
>>>>>>> | 1  |5.14 |1.00|21.21|
>>>>>>> +----+-----+----+-----+
>>>>>>>          NB. However ...
>>>>>>>
>>>>>>> If f/.@:(g/) would be replaced by special code
>>>>> like obta,
>>>>>> figures would be
>>>>>>> even (slightly?) more in favor of obta.
>>>>> --------------------------------------------------------------------
>>>>> -
>>>>> - For information about J forums see
>>>>> http://www.jsoftware.com/forums.htm
>>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> - For information about J forums see
>>>> http://www.jsoftware.com/forums.htm
>>>>
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

[Jprogramming] Inaccuracy of repeated squaring WAS: Convolution using FFT

Reply via email to