Hi,

How can I make a Python type that corresponds to the device built-in double2, both in that it has x and y fields _and_ in that it is aligned on 16 rather than 8 bytes? I am passing it as an argument to a kernel that expects to receive a double2; instead, it receives whatever is derived from

    k = int(1)
    l = int(2)
    # how do I align the following
    dbl2 = [('x','float64'), ('y','float64')]
    a2 = np.array((-0.5,-0.5), dtype=dbl2)
    ...
    kernel(k, l, a2, ...,arr.gpudata, block=(int(16),int(16),int(1)))

It either crashes or doesn't access arr.gpudata properly. I think what happens is that a2, when pushed onto the parameter stack, is not aligned the way the kernel declaration in CUDA expects:

    __global__ void kernel(int k, int l, double2 a2, ..., double2 *arr) {

If I instead declare

    struct my_double2 {double x,y;};
    __global__ void kernel(int k, int l, my_double2 a2, ..., double2 *arr) {

then it works. What is the best way to pack arguments currently in PyCUDA?

Thanks
Igor
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
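
The suspected mismatch can be sketched in plain Python. This is only an illustration, not PyCUDA's actual packing code: the helper param_offsets and the (name, size, alignment) tuples are hypothetical, and the sketch assumes each parameter is placed at the next multiple of its alignment, as in a C struct layout. It models only the first three arguments, since the ones elided by "..." are unknown. CUDA declares the built-in double2 with 16-byte alignment, while a plain struct of two doubles is aligned on 8 bytes.

```python
def param_offsets(params):
    """Return {name: byte offset} for a list of (name, size, alignment).

    Hypothetical helper: each parameter is placed at the next multiple
    of its alignment, the way a C compiler lays out struct members.
    """
    offsets = {}
    cursor = 0
    for name, size, alignment in params:
        # Round the cursor up to the next multiple of the alignment.
        cursor = (cursor + alignment - 1) // alignment * alignment
        offsets[name] = cursor
        cursor += size
    return offsets


# What the kernel expects: the built-in double2 is 16-byte aligned.
kernel_view = [("k", 4, 4), ("l", 4, 4), ("a2", 16, 16)]

# What an 8-byte-aligned pair of doubles (like my_double2) would give.
host_view = [("k", 4, 4), ("l", 4, 4), ("a2", 16, 8)]

kernel_offsets = param_offsets(kernel_view)
host_offsets = param_offsets(host_view)

print("kernel expects a2 at byte", kernel_offsets["a2"])
print("8-byte-aligned packing puts a2 at byte", host_offsets["a2"])
```

Under these assumptions the two layouts disagree about where a2 starts, so every argument after it would be read from the wrong place as well. That would be consistent with the kernel working once its declaration uses the 8-byte-aligned my_double2.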