Hi guys and gals, I'm wondering how I can measure (and afterwards study and improve) the execution time of kernel functions on the device, as well as of the host code, using PyCUDA. Any suggestions? Right now I'm just using Python's *datetime* module: I record the time before and after the kernel calls and subtract the two. On our Intel Xeon setup this gives a few hundred microseconds, but I want to be sure whether this is an accurate way of getting the run time. :)
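To make the question concrete, here is a minimal sketch of the kind of host-side timing I mean. The names `launch_kernel` and `time_call_us` are placeholders made up for this post, and the stand-in only does plain host work so the snippet runs without a GPU; in the real code that call would be the PyCUDA kernel invocation.

```python
from datetime import datetime


def launch_kernel(n):
    # Hypothetical stand-in for the PyCUDA kernel call (assumption for this
    # sketch): it does plain host work so the example runs without a GPU.
    return sum(i * i for i in range(n))


def time_call_us(func, *args):
    """Run func(*args) and return (result, elapsed time in microseconds).

    The time is taken on the host with datetime, before and after the call.
    """
    start = datetime.now()
    result = func(*args)
    elapsed = datetime.now() - start
    return result, elapsed.total_seconds() * 1e6


result, elapsed_us = time_call_us(launch_kernel, 100000)
print("elapsed: %.1f us" % elapsed_us)
```

What I am unsure about is whether a timer like this, which only sees the host side of the call, really captures the time the device spends executing the kernel.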
Regards, ./francis
_______________________________________________ PyCUDA mailing list PyCUDA@tiker.net http://lists.tiker.net/listinfo/pycuda