Hi,
I'm trying to pass a 3D array to a kernel. The kernel should take, in parallel,
each vector of the stack and multiply it by 2.
But I get this error:
error: expression must have pointer-to-object type
I know that C and Python types are different. I thought I should declare a
triple pointer in the kernel, but from reading some PyCUDA examples, I suppose
that on the C side each NumPy array is seen as a single pointer.
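To check my understanding of the single-pointer view, here is a small NumPy-only
sketch (no GPU involved). It assumes the array is C-contiguous (row-major), which
is NumPy's default for a newly created array. The helper name flat_index is made
up for illustration and is not part of NumPy or PyCUDA.

```python
import numpy

def flat_index(i, j, k, shape):
    """Offset of element [i][j][k] in a C-contiguous (row-major) buffer.

    Hypothetical helper, for illustration only.
    """
    _, ny, nz = shape
    return (i * ny + j) * nz + k

# Same shape as in my code, filled with distinct values so a mismatch would show.
a = numpy.arange(10 * 10 * 10, dtype=numpy.float32).reshape(10, 10, 10)

# 1D view of the same memory (assumption: a is C-contiguous, so no copy is made).
flat = a.ravel()

# The 3D element and the element at the computed flat offset are the same value.
assert a[3][4][5] == flat[flat_index(3, 4, 5, a.shape)]
```

If this is right, then with a single float pointer the kernel would presumably
have to compute such an offset itself rather than use two subscripts.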
This is my code:
import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
import numpy
import time
from pycuda.gpuarray import to_gpu
a = numpy.random.randn(10,10,10)
a = a.astype(numpy.float32)
a_gpu=to_gpu(a)
mod = SourceModule("""
__global__ void doublify(float *a)
{
    int k;
    int idx = threadIdx.x + threadIdx.y*blockDim.y;
    for (k = 0; k < 10; k++)
        a[idx][k] *= 2;
}
""")
func = mod.get_function("doublify")
func(a_gpu, block=(10,10,1),grid=(1,1))
print a
print "Matrix multiplied by 2:\n"
print a_gpu.get()
Thanks