Assuming you're talking about doing it on a conventional computer
(one or a few processors, fixed word length, etc.), the method you propose
is at least O(2^p * p) per squaring.  This is dominated by the
exponential factor 2^p, which must be avoided at all costs.
Huge constant factors can be accepted to get down from exponential time
to polynomial time; even brute-force ("schoolboy") multiplication is "only"
O(p^2).  (There, squaring has about a 2:1 advantage over multiplication.)
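To illustrate where that roughly 2:1 advantage comes from, here is a
minimal schoolboy-squaring sketch (function name and little-endian digit
layout are my own choices, not from the post): the diagonal partial
products d[i]^2 are computed once, and each cross product d[i]*d[j] with
i < j is computed once and doubled, so only about half of the p^2 digit
multiplies of a general multiply are needed.

```python
def schoolboy_square(digits, base=10):
    """Square a number given as a little-endian list of digits, in O(p^2)
    time.  Each cross term digits[i]*digits[j] (i < j) appears twice in
    the square, so it is computed once and doubled."""
    p = len(digits)
    result = [0] * (2 * p)
    for i in range(p):
        result[2 * i] += digits[i] * digits[i]          # diagonal terms, once
        for j in range(i + 1, p):
            result[i + j] += 2 * digits[i] * digits[j]  # cross terms, doubled
    carry = 0                                           # propagate carries
    for k in range(2 * p):
        total = result[k] + carry
        result[k] = total % base
        carry = total // base
    return result

# e.g. 99**2 = 9801:
# schoolboy_square([9, 9]) -> [1, 0, 8, 9]  (little-endian)
```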
The minimum possible for fixed word length is bounded by the
FFT approach at O(p * log(p) * log(log(p))) from above, and by
O(p * log(p) / (log(log(p)))^2) from below under certain severe restrictions.
These two limits are not far apart, since log(log(p)) hardly moves at all:
          p           ln(p)        ln(ln(p))
         15   2.708050201102  0.9962288929514
  1,000,000   13.81551055796  2.625791914476
  5,260,000   15.47564158471  2.739267277143
 33,000,000   17.31201811943  2.851400949303
800,000,000   20.50012228563  3.020430851279
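The table can be regenerated with a few lines of Python (purely an
illustration; the formatting choices are mine):

```python
import math

# ln(ln(p)) barely moves even as p grows by many orders of magnitude,
# which is why the FFT upper and lower bounds stay close together.
for p in (15, 1_000_000, 5_260_000, 33_000_000, 800_000_000):
    lp = math.log(p)        # natural log, ln(p)
    llp = math.log(lp)      # ln(ln(p))
    print(f"{p:>12,}  {lp:16.11f}  {llp:16.13f}")
```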

The numbers to be added average close to p bits in length;
for a word length w, each addition therefore takes O(p/w) word operations.
And there are O(2^p) such additions.
Each term must be generated, and a large amount of memory must be allocated
for the counter.
Take p around 33,000,000; the counter alone requires 33,000,000 bits
(about 4 MB) of storage.  This will not fit in second-level cache;
PCs with 4 MB of cache are rare.  Storing the running sum will take at
least that much again if you apply the modulo along the way; otherwise,
twice that.
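The proposed method itself is easy to write down, which makes the cost
plain to see; a minimal sketch (name is mine, not from the post):

```python
def square_by_odd_sums(k):
    """The proposed method: k**2 as the sum of the first k odd numbers
    (1 + 3 + 5 + ...).  Correct, but the loop body runs k times, so for
    k around 2**p this is O(2**p) iterations, each an O(p)-bit addition."""
    total = 0
    for n in range(1, k + 1):
        total += 2 * n - 1
    return total
```

Even k = 2^30 already means over a billion additions; for the Mersenne
exponents above, the iteration count is astronomically out of reach.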

Running the count n to completion will take O(2^33,000,000) clocks.
Compare that count to the number of picoseconds since Native Americans
put the early pictographs on the rocks of Sedona, Arizona:
7000 years * 365.24 days/year * 24 hours/day * (3600 * 10^12) picoseconds/hour
= 2.2 * 10^23 picoseconds =~ 2^77.55 picoseconds.  A picosecond is the time
it takes light to travel about 300 microns (0.3 mm) in vacuum.
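That arithmetic checks out in a couple of lines (figures taken from the
text above):

```python
import math

# Picoseconds in ~7000 years: years * days/year * hours/day * ps/hour
ps = 7000 * 365.24 * 24 * (3600 * 10**12)
print(f"{ps:.2e}")       # ~2.2e23 picoseconds
print(math.log2(ps))     # ~77.55, i.e. about 2**77.55 picoseconds
```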

At 10:47 AM 1999/07/22 -0600, <[EMAIL PROTECTED]> wrote:
>Just was thinking the other day about this... Forgive me if it's been
>discussed before etc.
>
>Aren't there other "better" algorithms for finding the square of a number
>besides using an FFT? Sure an FFT is great for multiplying two different
>numbers, but for squaring a number isn't the case a little different. Plus,
>going off what I've read on FFT's, it runs in O(n log n) time.
>
>So wouldn't a summation algorithm work better.
>
>For example, to find the square of a number k take the sum of n=1 to n=k of
>2n-1.
>
>So 3**2 is 1+3+5 = 9
>4**2 is 1+3+5+7 = 16
>etc...
>
>So the square performs in O(n) time. Plus, you could already throw in the
>sub 2 by starting at -1 instead of 1. And figuring out the mod is just as
>easy.
>
>k is the sum of n=2**p-1 to n=k of 2n-1 (This would give k**2 - 2**p-1).
>
>The only hassle is adding *big* numbers up, but I'm not sure how the
>ramifications of doing carries across boundaries and such would factor into
>the equation etc.
>
>Just a thought.
>
>-Jeremy
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
