I’m trying to divide a very large image (larger than device memory) into some
number of slices for processing. The problem is that the image may or may not
divide evenly into N slices, meaning that I either have to deal with slices of
differing sizes, which complicates image coordinates even more than they
already are, or else use some sort of padding.
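
To make the uneven-division problem concrete, here is a minimal sketch of the slice arithmetic, assuming the image is split along its rows. The helper name slice_plan and its return layout are made up for illustration; nothing here is a PyOpenCL call.

```python
def slice_plan(total_rows, n_slices):
    """Return (rows_per_slice, padded_rows, ranges) for an N-way row split.

    rows_per_slice is rounded up, so every slice buffer has the same size
    and only the final slice(s) contain fewer valid rows than that.
    """
    rows_per_slice = -(-total_rows // n_slices)  # ceiling division
    padded_rows = rows_per_slice * n_slices      # total rows including padding
    ranges = []
    for i in range(n_slices):
        start = min(i * rows_per_slice, total_rows)
        stop = min(start + rows_per_slice, total_rows)
        ranges.append((start, stop))             # valid rows of slice i
    return rows_per_slice, padded_rows, ranges


rows_per_slice, padded_rows, ranges = slice_plan(10, 4)
```

With 10 rows and 4 slices, every slice buffer holds 3 rows, and the last slice has only 1 valid row followed by 2 rows of padding.
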
Padding seems to be the best solution, but there is a nasty problem. There is
no way to copy a small array into the beginning of a larger array while
leaving the remaining part (the padding) untouched, nor to copy the first part
of a larger array back into a small array (the rest of the array is, again,
padding and not needed). So I have to either create an intermediate padded
staging array and do extra copying, or else pad the entire image array with
numpy.resize, which is horribly slow and memory-consuming for images of this
size.
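
The staging-array workaround can be sketched roughly as follows, using NumPy only. The generator name iter_padded_slices and the zero fill value are assumptions made for illustration, and the point where a real program would transfer the buffer to and from the device is marked with a comment rather than an actual call.

```python
import numpy as np


def iter_padded_slices(image, rows_per_slice, fill=0):
    """Yield (start, valid_rows, staging) for each row slice of image.

    A single padded staging buffer is allocated once and reused, so the
    full image is never padded; the cost is one extra host copy per slice.
    """
    staging = np.empty((rows_per_slice,) + image.shape[1:], dtype=image.dtype)
    for start in range(0, image.shape[0], rows_per_slice):
        chunk = image[start:start + rows_per_slice]
        valid = chunk.shape[0]
        staging[:valid] = chunk   # the extra host-side copy
        staging[valid:] = fill    # padding; non-empty only for a short slice
        yield start, valid, staging


image = np.arange(10 * 4, dtype=np.float32).reshape(10, 4)
result = np.empty_like(image)
for start, valid, staging in iter_padded_slices(image, 3):
    # A real program would upload staging, run the kernel, and download here.
    result[start:start + valid] = staging[:valid]  # keep only the valid rows
```

This avoids padding the whole image, but every slice still goes through the intermediate buffer in both directions, which is exactly the extra copying I would like to avoid.
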
What I really need is a partial-array to partial-array copy mechanism. I know
that certain *limitations* of OpenCL 1.0 make this impossible in full
generality, due to the lack of an offset into the device buffer. But even with
the device-side offset fixed to 0, a byte_count argument would be sufficient
for what I need to do, since the padding can always be placed at the end of
the slice.
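
The semantics being asked for can be illustrated on plain host arrays, with a NumPy array standing in for the padded device buffer. The function copy_bytes below is hypothetical and is not part of PyOpenCL; it only shows what a byte_count-limited copy with both offsets fixed to 0 would do.

```python
import numpy as np


def copy_bytes(dst, src, byte_count):
    """Copy the first byte_count bytes of src into the start of dst.

    Bytes of dst beyond byte_count are left untouched. Both arrays must be
    C-contiguous so that they can be viewed as flat byte sequences.
    """
    if not (dst.flags.c_contiguous and src.flags.c_contiguous):
        raise ValueError("both arrays must be C-contiguous")
    if byte_count > src.nbytes or byte_count > dst.nbytes:
        raise ValueError("byte_count exceeds the size of one of the arrays")
    dst_bytes = dst.reshape(-1).view(np.uint8)  # flat byte view, no copy
    src_bytes = src.reshape(-1).view(np.uint8)
    dst_bytes[:byte_count] = src_bytes[:byte_count]


small = np.arange(4, dtype=np.float32)       # a short final slice (16 bytes)
padded = np.full(6, -1.0, dtype=np.float32)  # stand-in for the padded buffer

copy_bytes(padded, small, small.nbytes)      # "upload": padding untouched
out = np.empty_like(small)
copy_bytes(out, padded, out.nbytes)          # "download": valid part only
```
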
I really need byte_count for a buffer<=>host transfer!
Any ideas?