Also, are you using numpy with MKL? Those numpy times are really fast.

-----Original Message-----
From: pycuda-boun...@tiker.net [mailto:pycuda-boun...@tiker.net] On Behalf Of Serra, Mr. Efren, Contractor, Code 7542
Sent: Monday, April 09, 2012 4:26 PM
To: 'Eli Stevens (Gmail)'
Cc: pycuda@tiker.net
Subject: Re: [PyCUDA] numpy.sum 377x faster than gpuarray.sum

Array size: 4000
GPU array time: 0.000420s
numpy array time: 0.000001s

Array size: 40,000
GPU array time: 0.001648s
numpy array time: 0.000002s

Array size: 400,000
GPU array time: 0.000576s
numpy array time: 0.000002s

Array size: 4,000,000
GPU array time: 0.001961s
numpy array time: 0.000001s

Eli,

I have just started to experiment with PyCUDA and was hoping to use it to
compute the mean and standard deviation of some atmospheric data; however,
the numbers above don't show much promise.

Efren A. Serra (Contractor)
DeVine Consulting, Inc.
Naval Research Laboratory
Marine Meteorology Division
7 Grace Hopper Ave., STOP 2
Monterey, CA 93943
Code 7542
Office: 831-656-4650

-----Original Message-----
From: Eli Stevens (Gmail) [mailto:wickedg...@gmail.com]
Sent: Monday, April 09, 2012 2:11 PM
To: Serra, Mr. Efren, Contractor, Code 7542
Cc: pycuda@tiker.net
Subject: Re: [PyCUDA] numpy.sum 377x faster than gpuarray.sum

There are fixed startup costs that do not amortize well over only 400
elements. What happens when you vary the size of the array over several
orders of magnitude?

Eli

On Mon, Apr 9, 2012 at 2:05 PM, Serra, Mr. Efren, Contractor, Code 7542 <efren.serra....@nrlmry.navy.mil> wrote:
> import numpy
> """
> """
> import pycuda.driver as cuda
> import pycuda.tools
> import pycuda.gpuarray as gpuarray
> import pycuda.autoinit, pycuda.compiler
>
> a=numpy.arange(400)
> a_gpu=gpuarray.arange(400,dtype=numpy.float32)
>
> start=cuda.Event()
> end=cuda.Event()
> start.record()
> gpuarray.sum(a_gpu).get()/a.size
> end.record()
> end.synchronize()
> print "GPU array time: %fs" %(start.time_till(end)*1e-3)
>
> start.record()
> numpy.sum(a)/a.size
> end.record()
> end.synchronize()
> print "numpy array time: %fs" %(start.time_till(end)*1e-3)
>
> GPU array time: 0.000377s
> numpy array time: 0.000001s
>
> Efren A. Serra (Contractor)
> DeVine Consulting, Inc.
> Naval Research Laboratory
> Marine Meteorology Division
> 7 Grace Hopper Ave., STOP 2
> Monterey, CA 93943
> Code 7542
> Office: 831-656-4650
>
>
> _______________________________________________
> PyCUDA mailing list
> PyCUDA@tiker.net
> http://lists.tiker.net/listinfo/pycuda
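
Eli's suggestion to vary the array size over several orders of magnitude can be sketched on the host side as follows. This is a minimal sketch, not the code used in the thread. It assumes Python 3 and times only the numpy path, using time.perf_counter as the clock. CUDA events timestamp work on the GPU stream, so they may not be a reliable clock for host-only code such as numpy.sum. The helper names best_wall_time and sweep are hypothetical, and the untimed warm-up call and best-of-N repeat are assumptions about reasonable benchmarking practice rather than anything stated in the original messages.

```python
import time

import numpy


def best_wall_time(fn, repeats=5):
    """Return the smallest wall-clock duration (in seconds) of fn() over several runs.

    Hypothetical helper: one untimed warm-up call is made first so that
    one-time setup cost does not end up in the measurement.
    """
    fn()  # warm-up call, not timed
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best


def sweep(sizes, repeats=5):
    """Time the sum-based mean with numpy for each size; returns (size, seconds) pairs."""
    results = []
    for n in sizes:
        a = numpy.arange(n, dtype=numpy.float32)
        # The lambda is called inside this iteration, so it sees the current `a`.
        seconds = best_wall_time(lambda: numpy.sum(a) / a.size, repeats)
        results.append((n, seconds))
    return results


if __name__ == "__main__":
    # Same sizes as reported in the thread.
    for n, seconds in sweep([4000, 40000, 400000, 4000000]):
        print("Array size: %9d  numpy array time: %.6fs" % (n, seconds))
```

To compare against the GPU, the same helper could be handed a callable that wraps gpuarray.sum(a_gpu).get(), assuming a working PyCUDA context. The .get() call copies the result back to the host, so the wall-clock interval should then cover the whole reduction, and the warm-up call would absorb any one-time setup cost.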