Hi, all,
I have some code that I would like to contribute to PyCUDA. What would
be the preferred way of doing so? Create a branch in git?
I have been working on some generalizations of the 'elementwise'
functionality recently. Specifically, I have made a GriddedKernel class
that operates on nd-arrays and handles the computation of grid indices. As
a simple example, this allows one to write an outer product kernel along
the lines of:
    GriddedKernel(
        arguments = 'float32[:,:] out, float32[:] x, float32[:] y',
        body = 'out[i] = x[xi] * y[yi];')
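To make the index names concrete, here is a rough pure-Python sketch of what the kernel body computes. It is not the GriddedKernel implementation. It assumes row-major storage, that i is the flat index into out, and that xi and yi index the first and second axis respectively (the post does not state this mapping). The function name is made up for illustration.

```python
def outer_product_reference(x, y):
    """Pure-Python stand-in for the kernel body `out[i] = x[xi] * y[yi]`."""
    nx, ny = len(x), len(y)
    out = [0.0] * (nx * ny)      # flat, row-major storage for out[nx, ny]
    for i in range(nx * ny):     # one loop iteration plays the role of one thread
        xi, yi = divmod(i, ny)   # assumed mapping from flat index to grid indices
        out[i] = x[xi] * y[yi]
    return out

result = outer_product_reference([1.0, 2.0], [3.0, 4.0, 5.0])
print(result)
```

On the GPU each value of i would be handled by its own thread, with the index arithmetic generated by the class rather than written by hand.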
Furthermore, I have created a StencilKernel class. It takes an arbitrary
stencil as input, and creates an unrolled kernel from that. I am using
it for a watershed segmentation algorithm at the moment, but a simpler
example would be something like a Laplacian:
    StencilKernel(
        stencil = laplacian3x3,
        arguments = 'float32[512,512] out, float32[512,512] in',
        loop_start = 'float32 o = 0',
        loop_body = 'o += in[j] * weight[j];',
        loop_end = 'out[i] = o')
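As an illustration of what unrolling a stencil could look like, the sketch below turns a small weight table into one C statement per nonzero tap, with offsets and weights baked in as literals. This is a guess at the general technique, not the StencilKernel code: the nested-list form of laplacian3x3, the row_pitch parameter, and the function name unroll_stencil are all assumptions.

```python
# Assumed representation of the stencil: a nested list of weights.
laplacian3x3 = [[0, 1, 0],
                [1, -4, 1],
                [0, 1, 0]]

def unroll_stencil(stencil, row_pitch, src="in", acc="o"):
    """Emit one C statement per nonzero stencil tap, centred on flat index i."""
    cy, cx = len(stencil) // 2, len(stencil[0]) // 2
    lines = []
    for sy, row in enumerate(stencil):
        for sx, weight in enumerate(row):
            if weight == 0:
                continue  # zero taps contribute nothing, so no code is emitted
            # flat offset of this tap relative to the centre element
            offset = (sy - cy) * row_pitch + (sx - cx)
            lines.append(f"{acc} += {src}[i + ({offset})] * {float(weight)}f;")
    return "\n".join(lines)

# row_pitch=514 assumes a 512-wide array padded by one element on each side
print(unroll_stencil(laplacian3x3, row_pitch=514))
```

Because the loop runs at code-generation time, the compiled kernel contains no loop and no weight lookups, only five multiply-adds for this stencil.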
The StencilKernel class contains convenience functions for allocating and
transferring padded memory, so the stencil can always read safely, and
one can easily implement boundary conditions of one's choice.
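The padding idea can be sketched in plain Python as follows. The helper pad2d is hypothetical and works on nested lists on the host only; the real convenience functions also handle device allocation and transfer, which this sketch leaves out.

```python
def pad2d(grid, halo, fill=0.0):
    """Return a copy of a 2D list surrounded by `halo` cells of `fill`."""
    width = len(grid[0]) + 2 * halo
    border = [[fill] * width for _ in range(halo)]
    middle = [[fill] * halo + list(row) + [fill] * halo for row in grid]
    # build fresh rows for the bottom border so no row object is shared
    return border + middle + [list(row) for row in border]

padded = pad2d([[1.0, 2.0], [3.0, 4.0]], halo=1)
for row in padded:
    print(row)
```

With a halo as wide as the stencil radius, every tap of an interior element lands inside the padded array. A constant fill gives a fixed-value boundary; other boundary conditions would amount to filling the halo differently.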
As you might have noticed, I have deviated from the elementwise way of
specifying arguments. numpy dtypes get translated into C types (I hate
the implicitness of C types); in addition, one can specify size and
dimensionality constraints on arguments, which by default are translated
into runtime type/shape checks.
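A minimal sketch of how such an argument string could be parsed and checked is given below. It is an illustration under assumptions, not the code from the post: the CTYPES table, parse_arguments, and shape_matches are hypothetical names, and only a handful of dtypes are mapped.

```python
import re

# Assumed dtype-to-C-type table; only a few entries are shown.
CTYPES = {"float32": "float", "float64": "double",
          "int32": "int", "uint8": "unsigned char"}

def parse_arguments(spec):
    """Split 'float32[:,:] out, ...' into (name, dtype, ctype, shape) tuples."""
    parsed = []
    for dtype, dims, name in re.findall(r"(\w+)\[([^\]]*)\]\s+(\w+)", spec):
        # ':' leaves an axis unconstrained (None); a number pins its extent
        shape = tuple(None if d.strip() == ":" else int(d)
                      for d in dims.split(","))
        parsed.append((name, dtype, CTYPES[dtype], shape))
    return parsed

def shape_matches(constraint, shape):
    """Runtime check: same number of axes, and every pinned extent agrees."""
    return len(constraint) == len(shape) and all(
        c is None or c == s for c, s in zip(constraint, shape))

args = parse_arguments("float32[:,:] out, float32[:] x, float32[:] y")
print(args)
```

A spec such as 'float32[512,512] in' would yield the constraint (512, 512), which shape_matches then compares against the shape of the array actually passed in.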
However, there is still a lot of work to be done. I am quite new to all
this CUDA stuff, and I bet my code could be greatly improved, and
expanded to use cases that have not yet occurred to me. For instance, my
kernel code is still quite naive, and does not use shared memory. Also,
I have only covered 2d and 3d use cases so far, but by looping over the
larger strides, arbitrary nd-kernels could be created. My template code
is a mess and I should probably use codepy instead. And so on...
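The "looping over the larger strides" idea might look roughly like this: the thread grid covers the innermost axes, and an outer loop walks the remaining leading axes by adding their flat offsets. The sketch assumes row-major layout, and the function name leading_offsets is hypothetical.

```python
from itertools import product

def leading_offsets(shape, inner_dims=2):
    """Flat base offsets of every inner block in a row-major nd-array."""
    strides, acc = [], 1
    for extent in reversed(shape):   # row-major strides, counted in elements
        strides.insert(0, acc)
        acc *= extent
    outer_shape = shape[:-inner_dims]
    outer_strides = strides[:-inner_dims]
    # one offset per combination of leading indices
    return [sum(i * s for i, s in zip(index, outer_strides))
            for index in product(*(range(n) for n in outer_shape))]

offsets = leading_offsets((2, 3, 4, 5))
print(offsets)
```

Each offset marks the start of one 4x5 slab, so an existing 2d kernel could in principle be applied once per offset.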
If there is anybody willing to help me advance this, I'd love to create a
git branch for it and try my best to document and clean up my code, and
integrate it with PyCUDA style conventions (perhaps create an
elementwise branch that adheres to the same interface, and so on?). But
if it's just me being excited about this, I probably won't bother. Even if
you don't want to help directly, your thoughts and comments are most welcome.
Eelco
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda