I found a place to patch in pycuda-2011.1.2\pycuda\gpuarray.py, class vec:

    ...
    for base_name, base_type, counts in [
        ('char', np.int8, [1,2,3,4]),
        ('uchar', np.uint8, [1,2,3,4]),
        ('short', np.int16, [1,2,3,4]),
        ('ushort', np.uint16, [1,2,3,4]),
        ('int', np.uint32, [1,2,3,4]),
        ('uint', np.uint32, [1,2,3,4]),
        ('long', long_dtype, [1,2,3,4]),
        ('ulong', ulong_dtype, [1,2,3,4]),
        ('longlong', np.int64, [1,2]),
        ('ulonglong', np.uint64, [1,2]),
        ('float', np.float32, [1,2,3,4]),
        ('ulonglong', np.float64, [1,2]),
        ]:
    ...

Can I just change the second occurrence of ulonglong to double? Do I need to recompile anything then?
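
For reference, here is a minimal NumPy-only sketch of what I would expect such an entry to provide, together with the alignment issue from the thread quoted below. The names double2, make_double2 and arg_offsets are hypothetical helpers for illustration, not PyCUDA's own code. The 4-byte int sizes come from the kernel signature. The 16-byte alignment of the built-in double2 is what CUDA declares, whereas a plain struct of two doubles only needs 8. Whether PyCUDA actually packs the argument with 8-byte alignment is an assumption on my part.

```python
import numpy as np

# Hypothetical stand-in for the record type a ('double', np.float64, [1, 2])
# entry would be expected to generate; this is not PyCUDA's own code.
double2 = np.dtype([("x", np.float64), ("y", np.float64)])

def make_double2(x, y):
    # Zero-dimensional record array with fields x and y.
    return np.array((x, y), dtype=double2)

def arg_offsets(args):
    """Byte offset of each (size, alignment) argument when packed in order."""
    offsets, pos = [], 0
    for size, align in args:
        pos = (pos + align - 1) // align * align  # round up to the alignment
        offsets.append(pos)
        pos += size
    return offsets

a2 = make_double2(-0.5, -0.5)
size = a2.dtype.itemsize  # two float64 fields, 16 bytes

# kernel(int k, int l, <a2>, ...): two 4-byte ints, then the 16-byte pair.
offsets_align8 = arg_offsets([(4, 4), (4, 4), (size, 8)])    # like my_double2
offsets_align16 = arg_offsets([(4, 4), (4, 4), (size, 16)])  # like double2
print(offsets_align8)
print(offsets_align16)
```

If the host side places a2 at the 8-byte-aligned offset while the kernel expects the 16-byte-aligned one, every argument from a2 onwards is read from the wrong place. That would be consistent with the kernel working once it is declared with my_double2.
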

On Mon, Nov 7, 2011 at 11:26 AM, Igor <rych...@gmail.com> wrote:
> Thanks. The reason I felt I had to define my own double2 in PyCUDA is
> that it's not among the types exposed by gpuarray.vec... if it is,
> then how do I use it because the following (second line) does not
> work?
>
> print gpuarray.vec.make_float3(1,2,3)
> print gpuarray.vec.make_double2(1,2)
>
> (1.0, 2.0, 3.0)
> Traceback (click to the left of this block for traceback)
> ...
> AttributeError: class vec has no attribute 'make_double2'
>
>
> On Mon, Nov 7, 2011 at 9:19 AM, Andreas Kloeckner
> <li...@informa.tiker.net> wrote:
>> On Sun, 6 Nov 2011 19:23:10 +1300, Igor <rych...@gmail.com> wrote:
>>> Hi,
>>> How can I make a python type that corresponds to the device built-in
>>> double2 both in that it has x,y fields _and_ aligned on 16 and not 8
>>> bytes? I am passing it as an argument to the kernel that expects to
>>> receive double2 instead it receives whatever is derived from
>>>
>>> k = int(1)
>>> l = int(2)
>>>
>>> # how do I align the following
>>> dbl2 = [('x','float64'), ('y','float64')]
>>> a2 = np.array((-0.5,-0.5), dtype=dbl2)
>>> ...
>>> kernel(k, l, a2, ...,arr.gpudata, block=(int(16),int(16),int(1)))
>>>
>>> It either crashes or doesn't access properly arr.gpudata. I think what
>>> happens is that a2 is not aligned when pushed into parameters stack as
>>> expected by the kernel declaration in CUDA:
>>>
>>> __global__ void kernel(int k, int l, double2 a2, ..., double2 *arr) {
>>>
>>> if instead,
>>> struct my_double2 {double x,y;};
>>> __global__ void kernel(int k, int l, my_double2 a2, ..., double2 *arr) {
>>>
>>> then it works.
>>>
>>> What is the best way to pack arguments currently in PyCUDA?
>>
>> Try using the vector types:
>> http://documen.tician.de/pycuda/array.html#vector-types
>>
>> HTH,
>> Andreas
>>
>
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda