Re: Mersenne: NTT faster than FFT ?
At 07:41 PM 3/17/02 +0100, you wrote: >Usually, we do several NTT modulo different primes and then use the Chinese >Remainder Theorem to get t[0] ;...; t[i] ;...; t[l-1] modulo p, because the >terms in the central loop of the transform can exceed the largest integer >that fits in a word machine (for instance, 2^32-1) . That's one of the >reasons why NTT are slower than FTT. > >Why don't we compute these terms modulo p at each step of the central loop ? >Consequently, it would be useless to do several NTT and the Chinese >Remainder Theorem would not be required. If you can do computations modulo p directly there's no reason to use the Chinese remainder theorem. The down side is that if you can break p up into L primes, arithmetic mod p is much more than L times as expensive as arithmetic mod the smaller primes, at least for most p. Using the CRT means a mess when the convolution has finished but not in any intermediate step; that's the big advantage. Also, when talking strictly about Mersenne transforms, your prime p must have a large power-of-two root of 2, and so must all of your L primes. I don't think any of the 32-bit primes that are suitable for an NTT have this property. If you want to explore your idea further, here are some primes that may be useful. All are of the form i*2^e+1, and have the advantage that roots of unity out to some small order are "simple": iexponent FFT radixDWT size root - 007d43548abc4361 320 8 2^17 67 04235deab8321000 512 8 2^22 19 0051656f7fe405c1 2048 8 2^19 29 07ab47a650572a01 2112 8 2^16 29 06954fe21e3e8100 38432 N/A 37 290d741 179232 N/A5 >Even if I am wrong, the speed of NTT can be improved by using 64 bit >integers ( shorter NTT length ) supported by both new processors and new C >compilers, and by using Intel MMX instructions. The only operations MMX can perform at full 64-bit precision are shifts. Modular arithmetic using MMX is possible but extremely unwieldy; SSE would work better, but nowhere near as well as SSE floating point. My personal opinion is that NTTs will not be faster than FFTs until modular multiplication has the same throughput as floating point multiplication. Right now we're at least a factor of 3-5 away from that. Hope this helps, jasonp _ Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers
Mersenne: NTT faster than FFT ?
I might have found a way of speeding up NTT that are usually slower than FFT. Firts, as some people may not know Number Theoritic Transfoms ( NTT ) , I am going to remember few ideas about them. NTT are also called FFT modulo a prime. Instead of using complex numbers, NTT compute wxith integer modulo a prime p. Thus, all operations are done modulo p ( that is to say, in the ring of integers modulo p) . There are restrictions for the choice of p : - the length l of then NTT must divide (p-1) ; - the radix b must be lower than p ; - p must be prime and (p-1)highly composite. With 32 bit integers , a common choice is : p=13*2^28+1 ; b=2^28. So, the number M represented by an array t[0..(l-1)] of l elements is : M=t[0]+t[1]*b+...+t[l-1]*b^(l-1)=t[0]+t[1]*2^28+...+t[l-1]*(2^28)^(l-1) . Usually, we do several NTT modulo different primes and then use the Chinese Remainder Theorem to get t[0] ;...; t[i] ;...; t[l-1] modulo p, because the terms in the central loop of the transform can exceed the largest integer that fits in a word machine (for instance, 2^32-1) . That's one of the reasons why NTT are slower than FTT. Why don't we compute these terms modulo p at each step of the central loop ? Consequently, it would be useless to do several NTT and the Chinese Remainder Theorem would not be required. Even if I am wrong, the speed of NTT can be improved by using 64 bit integers ( shorter NTT length ) supported by both new processors and new C compilers, and by using Intel MMX instructions. _ Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers
Re: Mersenne: Primenet Manual testing forms
Hi, On Sat, 16 Mar 2002 20:40 -0500, Paul Victor Novarese wrote: > > I've been using a ksh script I wrote to grab exponents from the > Primenet manual testing forms via wget and feed them to my alphas' > worktodo.ini files... Recently I noticed this no longer was getting > any work. Looking at the logs, I see that the server is returning > the following: > > Error 19: Unsupported access - please contact [EMAIL PROTECTED] > > The URL I pass to wget looks something like this: > > >http://mersenne.org/cgi-bin/primenet_checkout.pl?UserID=novarese%26UserPW=notmyrealpassword%26compID=w00t%26cpuType=P6%26speed=200%26hours=24%26LL=OFF%26factor=OFF%26dcheck=ON%26days=1 > > (I use "%26" instead of "&" to avoid any ksh unpleasantness) > > I also get this error from when using lynx as my browser; however, > I am still able to submit completed work without any problems. > Maybe just a coincident, but to me it looks like PrimeNet's checkout script evaluates the User-Agent header: Using "wget --user-agent=..." or "lynx -useragent=..." works with 'Mozilla/...': 'Mozilla/4.78 [de] (X11; U; Linux 2.2.19 ppc; Nav)' 'Mozilla/4.76 [en] (X11; U; SunOS 5.8 sun4u)' 'Mozilla/' but results into "Error 19: Unsupported access - ..." for 'Glucas/2.9' 'Lynx/2.8.4dev.7 libwww-FM/2.14' 'Mozilla' 'Wget/1.8' Klaus _ Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers