Hi,
How can I make a Python type that corresponds to the device built-in
double2, both in that it has x, y fields _and_ in that it is aligned on
16 bytes rather than 8? I am passing it as an argument to a kernel that
expects to receive a double2, but instead it receives whatever is
derived from the following:

import numpy as np

k = int(1)
l = int(2)

# how do I align the following
dbl2 = [('x','float64'), ('y','float64')]
a2 = np.array((-0.5,-0.5), dtype=dbl2)
...
kernel(k, l, a2, ..., arr.gpudata, block=(int(16), int(16), int(1)))

It either crashes or does not access arr.gpudata properly. I think what
happens is that a2, when it is pushed onto the parameter stack, is not
aligned the way the kernel declaration in CUDA expects:

__global__ void kernel(int k, int l, double2 a2, ..., double2 *arr) {

If I instead declare a plain struct and use it for a2:
struct my_double2 {double x,y;};
__global__ void kernel(int k, int l, my_double2 a2, ..., double2 *arr) {

then it works.
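
To illustrate the mismatch I suspect, here is a minimal pure-Python
sketch of how the parameter offsets would come out under each
declaration. The helper param_offsets and the (name, size, alignment)
tuples are made up for illustration and are not part of PyCUDA. The
sketch assumes each parameter is placed at the next offset that is a
multiple of its alignment, that double2 is 16-byte aligned, and that a
plain struct of two doubles is 8-byte aligned.

```python
import struct


def param_offsets(params):
    """Return ({name: offset}, total_size) for (name, size, alignment) tuples."""
    offsets = {}
    pos = 0
    for name, size, align in params:
        # Round the current position up to the next multiple of `align`.
        pos = (pos + align - 1) // align * align
        offsets[name] = pos
        pos += size
    return offsets, pos


# Layout if a2 is treated as a plain struct of two doubles (8-byte aligned).
plain_offsets, plain_size = param_offsets(
    [("k", 4, 4), ("l", 4, 4), ("a2", 16, 8)]
)

# Layout if a2 is treated as the built-in double2 (16-byte aligned).
double2_offsets, double2_size = param_offsets(
    [("k", 4, 4), ("l", 4, 4), ("a2", 16, 16)]
)

# The same two layouts as raw bytes. "=" disables automatic padding,
# so the 8 pad bytes for the double2 case are spelled out as "8x".
plain_bytes = struct.pack("=iidd", 1, 2, -0.5, -0.5)
double2_bytes = struct.pack("=ii8xdd", 1, 2, -0.5, -0.5)

print(plain_offsets["a2"], plain_size)
print(double2_offsets["a2"], double2_size)
```

If this assumption is right, a2 sits at offset 8 in the first layout and
at offset 16 in the second, so a kernel declared with double2 would read
a2 and everything after it from the wrong place.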

What is currently the best way to pack such arguments in PyCUDA?
Thanks
Igor

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
