I’m trying to divide a very large image (larger than device memory) into some 
number of slices for processing.  The problem is that the image may or may not 
divide evenly into N slices, meaning that I either have to deal with slices of 
differing sizes, which complicates image coordinates even more than they 
already are, or else use some sort of padding.
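
To make the arithmetic concrete, here is a minimal sketch of the padded layout, 
assuming the image is sliced along rows.  The helper name slice_layout and the 
example sizes are made up for illustration.

```python
def slice_layout(total_rows, n_slices):
    """Return (rows_per_slice, padding_rows) so that all slices are equal-sized."""
    # Ceiling division: the smallest slice height that covers every row.
    rows_per_slice = -(-total_rows // n_slices)
    # Rows that exist only as padding at the very end of the padded image.
    padding_rows = rows_per_slice * n_slices - total_rows
    return rows_per_slice, padding_rows


rows_per_slice, padding_rows = slice_layout(total_rows=1000, n_slices=3)
print(rows_per_slice, padding_rows)
```

With equal-sized slices, slice i always starts at row i * rows_per_slice, which 
keeps the coordinate bookkeeping simple.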

Padding seems to be the best solution, but there is a nasty problem.  There is 
no way to copy a small array into the beginning of a larger array while 
leaving the remaining part (the padding) untouched, or to copy just the first 
part of a larger array back into a small array and ignore the rest (which is, 
again, padding and not needed).  So I have to either create an intermediate 
padded staging array and do extra copying, or else pad the entire image array 
with numpy.resize, which is horribly slow and memory-consuming for images of 
this size.
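
For reference, the staging-array workaround looks roughly like the sketch 
below.  It is host-side NumPy only; padded_slice is a hypothetical helper, and 
the tiny 10x4 image just stands in for the real one.

```python
import numpy as np


def padded_slice(image, start_row, rows_per_slice):
    """Copy one slice into a fixed-size staging array; unused rows stay zero."""
    staging = np.zeros((rows_per_slice,) + image.shape[1:], dtype=image.dtype)
    # Slicing past the end of the image simply yields fewer rows.
    valid = image[start_row:start_row + rows_per_slice]
    # This assignment is the extra host-side copy I would like to avoid.
    staging[:valid.shape[0]] = valid
    return staging, valid.shape[0]


image = np.arange(10 * 4, dtype=np.float32).reshape(10, 4)
staging, valid_rows = padded_slice(image, start_row=8, rows_per_slice=4)
```

Every slice pays for one additional allocation and copy this way, even though 
only the last slice actually needs any padding.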

What I really need is a partial-array-to-partial-array copy mechanism.  I know 
that certain *limitations* of OpenCL 1.0 make this impossible in full 
generality due to the lack of an offset into the device buffer, but even with 
the device-side offset fixed at 0, a byte_count would be sufficient for what I 
need to do, since the padding can always be placed at the end of the slice.
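
To show the semantics I am after, here is a small model in plain NumPy.  The 
"device buffer" is just a uint8 array standing in for a padded buffer, and the 
two copy_prefix_* functions are hypothetical names, not PyOpenCL calls.

```python
import numpy as np


def copy_prefix_to_buffer(buffer_bytes, host_array):
    """Write host_array into the start of buffer_bytes; return the byte count."""
    src = np.ascontiguousarray(host_array).view(np.uint8).reshape(-1)
    byte_count = src.nbytes
    if byte_count > buffer_bytes.nbytes:
        raise ValueError("host array is larger than the buffer")
    # Bytes past byte_count (the padding) are left untouched.
    buffer_bytes[:byte_count] = src
    return byte_count


def copy_prefix_from_buffer(buffer_bytes, host_array):
    """Fill host_array from the first host_array.nbytes bytes of buffer_bytes."""
    if not host_array.flags.c_contiguous:
        raise ValueError("host array must be C-contiguous")
    dst = host_array.view(np.uint8).reshape(-1)  # byte view, no copy
    dst[:] = buffer_bytes[:dst.nbytes]


slice_rows, width = 4, 4
# Stand-in for a device buffer sized for a full (padded) float32 slice.
device_buffer = np.zeros(slice_rows * width * 4, dtype=np.uint8)
last_slice = np.arange(2 * width, dtype=np.float32).reshape(2, width)

n = copy_prefix_to_buffer(device_buffer, last_slice)
result = np.empty_like(last_slice)
copy_prefix_from_buffer(device_buffer, result)
```

Here only the 2 valid rows of the last slice are transferred in each direction, 
and the device-side offset is always 0.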

I really need byte_count for a buffer<=>host transfer!

Any ideas?
_______________________________________________
PyOpenCL mailing list
http://lists.tiker.net/listinfo/pyopencl