Walter White <homerun4...@gmail.com> writes:
> Hello,
>
> I have a question and hope that you can help me.
> I am trying to find the bottleneck in my code but I can't get a
> grip at the moment.
>
> For a while I thought it was the writes to global memory.
> At the moment I am using an early "return" statement in my
> code to skip parts of the code, e.g. a for-loop.
>
> Now I am wondering if this is working at all.
> Could it be that the code exits even way before
> the "return" statement when the compiler recognizes that
> calculations done in a for-loop are not written to
> global memory or used anywhere else?

The reliable way to tell is to look at the generated PTX. But generally, yes: if you don't write results to global memory, I think the NVIDIA compiler will treat the computation as dead code and get rid of your entire kernel.

Andreas
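
To make the "look at the PTX" suggestion concrete, here is a minimal sketch. The two kernels (`dead`, `live`) and the helper names are hypothetical and made up for illustration. It assumes PyCUDA and nvcc are installed, that `pycuda.compiler.compile` is called with `target="ptx"`, and that stores to global memory appear in the PTX as `st.global.*` instructions. The idea is simply to count those stores: if the count is zero, the loop feeding them has most likely been removed as well.

```python
import re

# Hypothetical kernel: the early return means `acc` is never stored,
# so the compiler is free to drop the loop entirely.
DEAD_KERNEL = r"""
__global__ void dead(float *out, int n)
{
    float acc = 0.0f;
    for (int i = 0; i < n; ++i)
        acc += i * 0.5f;
    return;                  // early return: acc is never written out
    out[threadIdx.x] = acc;  // unreachable
}
"""

# Same computation, but the result is written to global memory,
# which forces the compiler to keep it.
LIVE_KERNEL = r"""
__global__ void live(float *out, int n)
{
    float acc = 0.0f;
    for (int i = 0; i < n; ++i)
        acc += i * 0.5f;
    out[threadIdx.x] = acc;
}
"""


def count_global_stores(ptx_text):
    """Count occurrences of st.global.* instructions in PTX text."""
    return len(re.findall(r"\bst\.global\.", ptx_text))


def compile_to_ptx(source):
    """Compile CUDA source to PTX text (requires PyCUDA and nvcc)."""
    # Imported lazily so the helper above can be used without PyCUDA.
    from pycuda.compiler import compile as nvcc_compile
    return nvcc_compile(source, target="ptx").decode("utf-8")


def report_stores():
    """Print the number of global stores found in each kernel's PTX."""
    for name, src in [("dead", DEAD_KERNEL), ("live", LIVE_KERNEL)]:
        print(name, "global stores:", count_global_stores(compile_to_ptx(src)))
```

Calling `report_stores()` on a machine with a working CUDA toolchain prints one count per kernel. For timing experiments, a common workaround is to keep one cheap store to global memory, for example from a single thread, so the work being measured is not optimized away.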