I found the place to patch, in pycuda-2011.1.2\pycuda\gpuarray.py:
class vec:
...
for base_name, base_type, counts in [
        ('char', np.int8, [1,2,3,4]),
        ('uchar', np.uint8, [1,2,3,4]),
        ('short', np.int16, [1,2,3,4]),
        ('ushort', np.uint16, [1,2,3,4]),
        ('int', np.uint32, [1,2,3,4]),
        ('uint', np.uint32, [1,2,3,4]),
        ('long', long_dtype, [1,2,3,4]),
        ('ulong', ulong_dtype, [1,2,3,4]),
        ('longlong', np.int64, [1,2]),
        ('ulonglong', np.uint64, [1,2]),
        ('float', np.float32, [1,2,3,4]),
        ('ulonglong', np.float64, [1,2]),
        ]:
...
Can I just change the second occurrence of 'ulonglong' (the np.float64
entry) to 'double'? Would I need to recompile anything afterwards?
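
To illustrate why make_double2 goes missing, here is a simplified stand-in for that loop. It is pure Python and not PyCUDA's actual code: the names build_vec and _make_factory are made up, plain int/float stand in for the NumPy types, and it assumes the loop registers one make_<name><count> factory per table entry via setattr.

```python
# Simplified stand-in for the registration loop (hypothetical helper names).
def _make_factory(convert, count):
    # Build a factory that accepts exactly `count` components.
    def factory(*values):
        if len(values) != count:
            raise TypeError("expected %d values" % count)
        return tuple(convert(v) for v in values)
    return factory


def build_vec(table):
    # Attach one make_<name><count> factory per (name, count) pair.
    class Vec(object):
        pass
    for base_name, convert, counts in table:
        for count in counts:
            setattr(Vec, "make_%s%d" % (base_name, count),
                    staticmethod(_make_factory(convert, count)))
    return Vec


# Table with the duplicated 'ulonglong' name, as in the listing above.
buggy = build_vec([
    ("ulonglong", int, [1, 2]),
    ("float", float, [1, 2, 3, 4]),
    ("ulonglong", float, [1, 2]),
])

# Same table with the last entry renamed to 'double'.
fixed = build_vec([
    ("ulonglong", int, [1, 2]),
    ("float", float, [1, 2, 3, 4]),
    ("double", float, [1, 2]),
])

print(hasattr(buggy, "make_double2"))  # False: the name is never registered
print(buggy.make_ulonglong2(1, 2))     # floats: the later entry overwrote it
print(fixed.make_double2(1, 2))
```

In this sketch the duplicated name has two effects: make_double2 is never created, and make_ulonglong2 is silently replaced by the float-based version.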



On Mon, Nov 7, 2011 at 11:26 AM, Igor <rych...@gmail.com> wrote:
> Thanks. The reason I felt I had to define my own double2 in PyCUDA is
> that it's not among the types exposed by gpuarray.vec. If it is,
> then how do I use it? The second line of the following does not
> work:
>
> print gpuarray.vec.make_float3(1,2,3)
> print gpuarray.vec.make_double2(1,2)
>
> (1.0, 2.0, 3.0)
> Traceback (click to the left of this block for traceback)
> ...
> AttributeError: class vec has no attribute 'make_double2'
>
>
>
>
> On Mon, Nov 7, 2011 at 9:19 AM, Andreas Kloeckner
> <li...@informa.tiker.net> wrote:
>> On Sun, 6 Nov 2011 19:23:10 +1300, Igor <rych...@gmail.com> wrote:
>>> Hi,
>>> How can I make a Python type that corresponds to the device built-in
>>> double2, both in that it has x,y fields _and_ in that it is aligned on
>>> 16 rather than 8 bytes? I am passing it as an argument to a kernel that
>>> expects to receive a double2; instead, it receives whatever is derived from
>>>
>>> k = int(1)
>>> l = int(2)
>>>
>>> # how do I align the following
>>> dbl2 = [('x','float64'), ('y','float64')]
>>> a2 = np.array((-0.5,-0.5), dtype=dbl2)
>>> ...
>>> kernel(k, l, a2, ...,arr.gpudata, block=(int(16),int(16),int(1)))
>>>
>>> It either crashes or does not access arr.gpudata properly. I think what
>>> happens is that a2, when pushed onto the parameter stack, is not aligned
>>> the way the kernel declaration in CUDA expects:
>>>
>>> __global__ void kernel(int k, int l, double2 a2, ..., double2 *arr) {
>>>
>>> If I instead declare
>>> struct my_double2 {double x,y;};
>>> __global__ void kernel(int k, int l, my_double2 a2, ..., double2 *arr) {
>>>
>>> then it works.
>>>
>>> What is the best way to pack arguments currently in PyCUDA?
>>
>> Try using the vector types:
>> http://documen.tician.de/pycuda/array.html#vector-types
>>
>> HTH,
>> Andreas
>>
>
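
The alignment mismatch described in the original question can be sketched as plain offset arithmetic. This is an illustration only, assuming that kernel parameters are laid out in declaration order and that each one is placed at the next multiple of its alignment; the helper name param_offsets is made up. CUDA's built-in double2 is 16 bytes with 16-byte alignment, while a plain struct of two doubles is 16 bytes with 8-byte alignment.

```python
def param_offsets(params):
    """params: list of (name, size, alignment) -> dict of name -> byte offset."""
    offsets = {}
    pos = 0
    for name, size, align in params:
        # Round the running position up to the next multiple of `align`.
        pos = (pos + align - 1) // align * align
        offsets[name] = pos
        pos += size
    return offsets


# kernel(int k, int l, double2 a2, ...): a2 needs 16-byte alignment.
as_double2 = param_offsets([("k", 4, 4), ("l", 4, 4), ("a2", 16, 16)])

# kernel(int k, int l, my_double2 a2, ...): a2 needs only 8-byte alignment.
as_my_double2 = param_offsets([("k", 4, 4), ("l", 4, 4), ("a2", 16, 8)])

print(as_double2["a2"])     # 16
print(as_my_double2["a2"])  # 8
```

Under these assumptions, a host side that packs a2 with 8-byte alignment puts it at offset 8, which matches the my_double2 declaration but not the double2 one.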

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
