Thank you. These comments are very helpful. I don't understand what you mean by the following:

"You are also using full expression to calculate index: const int i = blockDim.x*blockIdx.x + threadIdx.x; while you are using only one block - this makes your code harder to analyse."

How else should I be calculating the index?

Thanks again.

On 12/13/2010 3:49 PM, Tomasz Rybak wrote:
> You are also using full expression to calculate index: const int i = blockDim.x*blockIdx.x + threadIdx.x; while you are using only one block - this makes your code harder to analyse.
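
For reference, here is a small sketch of the arithmetic in question, written as plain Python so it runs without a GPU. It only emulates the expression blockDim.x*blockIdx.x + threadIdx.x; the helper name global_index and the block size of 8 are assumptions made for illustration, not part of PyCUDA or of the original kernel. One possible reading of the comment is that with a single block blockIdx.x is always 0, so the full expression reduces to threadIdx.x.

```python
# Plain-Python emulation of the CUDA index arithmetic; no GPU required.
# global_index is a hypothetical helper, not a PyCUDA or CUDA function.
def global_index(block_dim_x, block_idx_x, thread_idx_x):
    """Mirror of the kernel expression blockDim.x*blockIdx.x + threadIdx.x."""
    return block_dim_x * block_idx_x + thread_idx_x


block_dim_x = 8  # assumed block size, for illustration only

# Single-block launch: blockIdx.x is always 0, so the full expression
# collapses to threadIdx.x.
single_block = [global_index(block_dim_x, 0, t) for t in range(block_dim_x)]

# Two-block launch: here the blockDim.x*blockIdx.x term is needed to give
# every thread in the grid a distinct index.
two_blocks = [
    global_index(block_dim_x, b, t)
    for b in range(2)
    for t in range(block_dim_x)
]
```

If that reading is right, a kernel that is only ever launched with one block could use threadIdx.x directly, while the full expression stays correct (just more general) in both cases.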