Hi Andrea,

On Wed, Jul 11, 2012 at 10:25 PM, Andrea Cesari <[email protected]> wrote:
> __global__ void gpu_kernel(int *corrGpu,int *aMod,int *b,int *kernelSize_h)
> {
> int j,step1=kernelSize_h[0]/2; // <---
> ...
> """)

When I remove /2 where the arrow points, I get results identical with the CPU version. Are you sure it is necessary there?

> About your advise: when i do: int idx = threadIdx.x+step, idx doesn't start
> from step1? so when j=0 idx-step1+j =0 ? it's wrong?

Yes, sorry, that was my mistake. Everything is correct in this part.

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
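
For reference, the index arithmetic from the quoted question can be checked on the host with a minimal Python sketch. It rests on assumptions that are not in the thread: the helper name `window_indices` is hypothetical, `step` in the quoted line is taken to mean `step1`, and the inner loop is assumed to run `j` from 0 to `kernelSize - 1`. It only mirrors the expression `idx - step1 + j`, not the rest of the kernel.

```python
def window_indices(thread_idx, kernel_size, halve=True):
    """Return the values of idx - step1 + j for one thread.

    Hypothetical helper: mirrors only the index expression from the
    quoted kernel, assuming j runs over range(kernel_size).
    """
    # step1 as in the quoted kernel, with or without the /2
    step1 = kernel_size // 2 if halve else kernel_size
    # idx = threadIdx.x + step1, as in the quoted question
    idx = thread_idx + step1
    return [idx - step1 + j for j in range(kernel_size)]


if __name__ == "__main__":
    # For the first thread and j = 0 the expression evaluates to 0,
    # so it never goes negative.
    print(window_indices(0, 5))
    # The offset cancels, so this expression is the same with or without /2.
    print(window_indices(3, 5, halve=True) == window_indices(3, 5, halve=False))
```

Since `step1` is added and then subtracted, this particular expression reduces to `threadIdx.x + j`. Any difference caused by the `/2` would have to come from another part of the kernel that uses `step1`.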
