Hi Jayanth,

I can run an 8192x8192 transform on a Tesla C2050 without problems. I think you are limited by the available video memory; see my previous message in this thread --- an 8192x4096 complex64 buffer alone takes 256 MB, and you have to factor in the temporary buffers PyFFT creates.
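For the record, the arithmetic behind that number (my calculation, using 8 bytes per complex64 element):

```python
import numpy

# An 8192 x 4096 single-precision complex buffer:
buffer_bytes = 8192 * 4096 * numpy.dtype(numpy.complex64).itemsize
print(buffer_bytes // 2**20, "MB")  # 256 MB, before any temporary buffers
```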
By the way, I would recommend switching from PyFFT to Reikna (http://reikna.publicfields.net). PyFFT is not supported anymore, and Reikna includes its code along with some additional features and optimizations (a more robust block/grid size finder, temporary array management, launch optimizations and so on). Your code would look like:

import numpy
import reikna.cluda as cluda
from reikna.fft import FFT

api = cluda.cuda_api()
thr = api.Thread.create()
# Or, if you want to use an external stream:
#
# cuda.init()
# context = make_default_context()
# stream = cuda.Stream()
# thr = api.Thread(stream)

data = numpy.ones((4096, 4096), dtype=numpy.complex64)
gpu_data = thr.to_device(data)  # transferring to a GPU array

fft = FFT(data).compile(thr)
fft(gpu_data, gpu_data)
result = gpu_data.get()
print result

On Fri, Dec 6, 2013 at 3:43 PM, Jayanth Channagiri <[email protected]> wrote:
> Dear Ahmed,
>
> Thank you for the resourceful reply.
>
> But the CUFFT limit is ~2^27, and the benchmarks on the CUFFT site also reach
> up to 2^25. In my case, I am able to reach only up to 2^24, so in some way I am
> missing another factor. Is this limited by my GPU's memory?
> Also, in the same table, the "Maximum width and height for a 2D texture
> reference bound to a CUDA array" is 65000 x 65000, which is far higher than
> what I can reach. My GPU has compute capability 2.x.
> Thank you for the idea of performing two separate sequential 1D FFTs. I will
> look into it. The thing is, my problem doesn't stop at 2D. My goal is to
> perform a 3D FFT, and I am not sure if I can calculate that the same way.
>
> For others on the list, here is the complete traceback of the error message:
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 493, in runfile
>     execfile(filename, namespace)
>   File "/home/jayanth/Dropbox/fft/fft1d_AB.py", line 99, in <module>
>     plan.execute(gpu_data)
>   File "/usr/local/lib/python2.7/dist-packages/pyfft-0.3.8-py2.7.egg/pyfft/plan.py", line 271, in _executeInterleaved
>     batch, data_in, data_out)
>   File "/usr/local/lib/python2.7/dist-packages/pyfft-0.3.8-py2.7.egg/pyfft/plan.py", line 192, in _execute
>     self._tempmemobj = self._context.allocate(buffer_size * 2)
> pycuda._driver.MemoryError: cuMemAlloc failed: out of memory
>
> Also, here is the simple program I was using to calculate the FFT with pyfft:
>
> from pyfft.cuda import Plan
> import numpy
> import pycuda.driver as cuda
> from pycuda.tools import make_default_context
> import pycuda.gpuarray as gpuarray
>
> cuda.init()
> context = make_default_context()
> stream = cuda.Stream()
>
> plan = Plan((4096, 4096), stream=stream)  # creating the plan
> data = numpy.ones((4096, 4096), dtype=numpy.complex64)  # data of ones, single precision
> gpu_data = gpuarray.to_gpu(data)  # converting to a GPU array
> plan.execute(gpu_data)  # calculating the FFT
> result = gpu_data.get()  # the result
>
> This is just a simple program to calculate the 2D FFT of a 4096 x 4096
> array. It works well for this array or a smaller one. But as soon as I
> increase it to higher values like 8192 x 8192 or 8192 x 4096, it gives an
> error message saying out of memory.
> So I wanted to know the reason behind this and how to overcome it.
> You could execute the same code and kindly let me know if you hit the same
> limits on your respective GPUs.
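An inline note on the traceback: to see whether device memory really is the binding constraint, pycuda.driver.mem_get_info() returns the (free, total) byte counts for the current context. The helper below is just a sketch of the arithmetic on that result; the factor of 3 is my assumption (the data buffer itself plus the buffer_size * 2 temporary visible in the traceback):

```python
import numpy

def plan_fits(free_bytes, shape, dtype=numpy.complex64, overhead_factor=3):
    # Estimated working set: the data buffer plus PyFFT's
    # buffer_size * 2 temporary (hence the assumed factor of 3).
    need = numpy.dtype(dtype).itemsize * int(numpy.prod(shape)) * overhead_factor
    return need <= free_bytes

# With roughly 1 GB free (optimistic for a 1 GB card):
print(plan_fits(10**9, (4096, 4096)))  # True: 3 * 128 MB fits
print(plan_fits(10**9, (8192, 8192)))  # False: 3 * 512 MB does not
```

In a real run you would pass the first element of cuda.mem_get_info() as free_bytes instead of the hard-coded 10**9.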
> Thank you.
>
> ________________________________
> Date: Thu, 5 Dec 2013 20:27:45 -0500
> Subject: Re: [PyCUDA] cuMemAlloc failed: out of memory
> From: [email protected]
> To: [email protected]
> CC: [email protected]
>
> I ran into a similar issue:
> http://stackoverflow.com/questions/13187443/nvidia-cufft-limit-on-sizes-and-batches-for-fft-with-scikits-cuda
>
> The long and short of it is that CUFFT seems to have a limit of
> approximately 2^27 elements that it can operate on, in any combination of
> dimensions. In the StackOverflow post above, I was trying to make a plan
> for large batches of the same 1D FFTs and hit this limitation. You'll also
> notice that the benchmarks on the CUFFT site
> (https://developer.nvidia.com/cuFFT) go up to sizes of 2^25.
>
> I hypothesize that this is related to the 2^27 "Maximum width for a 1D
> texture reference bound to linear memory" limit that we see in Table 12 of
> the CUDA C Programming Guide:
> http://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities
>
> So since 4096**2 is 2^24, increasing to 8192 by 8192 gets very close to
> the limit, even though you'd think 2D FFTs would not be governed by the
> same limits as 1D FFT batches.
>
> You should be able to achieve 8192 by 8192 and larger 2D FFTs by
> performing two separate sequential 1D FFTs, one horizontal and the other
> vertical. The runtimes should nominally be the same (they are for CPU
> FFTs), and the answer will be the same, up to machine precision.
>
> On Thu, Dec 5, 2013 at 9:53 AM, Jayanth Channagiri <[email protected]> wrote:
>
> Hello,
>
> I have an NVIDIA 2000 GPU. It has 192 CUDA cores and 1 GB of GDDR5 memory.
>
> I am trying to calculate an FFT on the GPU using pyfft.
> I am able to calculate the FFT only up to an array of at most 4096 x 4096.
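A footnote on the row-column suggestion quoted above (and on the 3D question): the decomposition is easy to sanity-check on the CPU with NumPy, and it extends to 3D by simply adding a third pass along the remaining axis. This is a sketch of the algorithm itself, not of any particular GPU API:

```python
import numpy

def fft2_by_rows_then_cols(a):
    # 2D FFT as two passes of batched 1D FFTs (row-column decomposition)
    tmp = numpy.fft.fft(a, axis=1)     # 1D FFT of every row
    return numpy.fft.fft(tmp, axis=0)  # then a 1D FFT of every column

rng = numpy.random.RandomState(0)
a = (rng.randn(64, 32) + 1j * rng.randn(64, 32)).astype(numpy.complex64)

ref = numpy.fft.fft2(a)
out = fft2_by_rows_then_cols(a)
print(numpy.allclose(out, ref, atol=1e-3))  # True: identical up to machine precision
```

On the GPU the same idea means two batched 1D plans instead of one large 2D plan, so each plan stays under the element limit.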
> But as soon as I increase the array size, it gives an error message saying:
>
> pycuda._driver.MemoryError: cuMemAlloc failed: out of memory
>
> Can anyone please tell me if this error means that my GPU is not
> sufficient to calculate this array? Or is it my computer's memory? Or a
> programming error? What is the maximum array size you can achieve with
> your GPU?
> Is there any information on how else I can calculate such huge arrays?
>
> Thank you very much in advance for the help, and sorry if this is too
> preliminary a question.
>
> Jayanth

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
