On Thu, 22 Sep 2011 02:41:05 +0100, Irwin Zaid <irwin.z...@physics.ox.ac.uk> 
wrote:
> Hi all,
> 
> I have a quick question about how long one should expect it to take to 
> regenerate kernels that have already been compiled (and therefore 
> presumably cached).
> 
> Some quick tests indicate that something like 40 ms is common when using 
> a SourceModule object for compilation. Is this about right? It seems 
> quite slow to me.

I've never measured this, on the assumption that nobody would be limited
by that rate. Are you using #include in your code? If so, that would
likely slow down the cache retrieval, because PyCUDA needs to check
whether the included files have changed. In addition, each cache
retrieval does go to disk. Finally, the binary has to be accepted by the
Nvidia runtime--I'm not sure how much time that takes. I do know that in
the worst case this last part might also include a call to ptxas (if
you're passing a mismatched binary)--but that shouldn't be happening.

If you'd like to improve this time, it'd be good if you could measure
which of these components (the #include check, the disk read, or the
load into the Nvidia runtime) is actually using the most time. You can
just throw a few timing and print statements into pycuda/compiler.py.
Once you know, let the list know, and we'll consider how to fix this.
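
Something along these lines might do for the measurement. It is only a
minimal sketch: `timed`, `timings` and `report` are hypothetical helper
names, not part of PyCUDA, and the stage labels in the example are
stand-ins for whatever steps you actually find in pycuda/compiler.py.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical helper (not part of PyCUDA): accumulated wall-clock
# seconds per label, summed over every use of the label.
timings = defaultdict(float)


@contextmanager
def timed(label):
    """Add the wall-clock time spent inside the with-block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] += time.perf_counter() - start


def report():
    """Print the accumulated time per label, slowest first."""
    for label, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
        print("%-20s %8.3f ms" % (label, seconds * 1e3))


if __name__ == "__main__":
    # Stand-in workloads; replace these with the real steps being measured.
    with timed("include check"):
        time.sleep(0.002)
    with timed("disk read"):
        time.sleep(0.001)
    report()
```

Wrapping each suspect step in its own `with timed("..."):` block and
calling `report()` after a few cached SourceModule constructions should
show which of the three components dominates the 40 ms.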

HTH,
Andreas


