Dear Efren,

Your numpy timings look too good to be true.

Array size: 4,000,000
GPU array time: 0.001961s
numpy array time: 0.000001s

This 1 microsecond seems to be rather constant.

start.record()
numpy.sum(a)/a.size
end.record()
end.synchronize()

Could it be that this timing code is only meant for asynchronous GPU calls? Try this instead:

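import timeit
# Time the numpy expression on the CPU; 'numpy' and 'a' come from the main script.
t = timeit.Timer(setup="from __main__ import numpy, a",
                 stmt="numpy.sum(a)/a.size")
# Average over 1000 runs to get seconds per call.
print("Numpy timing: %g s" % (t.timeit(1000) / 1000))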

The same approach could be interesting for your GPU calls if you want to get the Python wall-clock times.
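
To reuse this for more than one call, a small wrapper around timeit might look like the sketch below. The helper name walltime_per_call and the stand-in workload mean_of_data are made up for illustration; in your script you would pass something like lambda: numpy.sum(a)/a.size instead. For a GPU call, the callable would presumably also have to wait for the GPU to finish before returning, otherwise only the launch overhead gets timed.

```python
import timeit

def walltime_per_call(func, number=1000):
    """Return the average wall-clock seconds per call of func().

    Hypothetical helper: it simply wraps timeit.Timer with a callable.
    """
    timer = timeit.Timer(func)
    return timer.timeit(number) / number

# Stand-in workload so the sketch runs without numpy or a GPU.
# Replace it with e.g. lambda: numpy.sum(a) / a.size
data = list(range(100000))

def mean_of_data():
    # Plain-Python mean of the list, analogous to numpy.sum(a)/a.size
    return sum(data) / float(len(data))

elapsed = walltime_per_call(mean_of_data, number=100)
print("Mean wall time per call: %g s" % elapsed)
```

Passing a callable rather than a statement string avoids the "from __main__ import ..." setup line, at the cost of one extra function call per iteration.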

Best,

Jon

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
