Andreas Kloeckner <lists@...> writes:

> 
> Dear Keldor,
> 
> On Mon, 8 Aug 2011 23:32:12 -0600, <keldor@...> wrote:
> > I’m trying to divide a very large image (larger than device memory) into
> > some number of slices for processing.  The problem is that the image may
> > or may not divide evenly into N slices, meaning that I either have to
> > deal with slices of differing sizes, which complicates image coordinates
> > even more than they already are, or else use some sort of padding.
> > 
> > Padding seems to be the best solution, however there is a nasty problem.
> > Since there’s no way of copying a small array into a larger array (just
> > copy the small array into the beginning of the larger array, and leave
> > the remaining part, the padding, untouched), or of copying the first part
> > of a larger array back into the small array (the rest of the array is,
> > again, padding and not needed), I have to either create an intermediate
> > padded staging array and do extra copying, or else pad the entire image
> > array with numpy.resize, which is horribly slow and memory consuming for
> > images of this size.
> > 
> > What I really need is a partial array to partial array copy mechanism.  I
> > know that certain *limitations* of opencl 1.0 make this impossible with
> > full generality due to the lack of an offset into the device buffer, but
> > even with the device side offset fixed to 0, the byte_count is sufficient
> > for what I need to do, since the padding can always be placed at the end
> > of the slice.
> > 
> > I really need byte_count for a buffer<=>host transfer!
> 
> I didn't include byte_count because I think you can always achieve the
> same effect by obtaining a view of the numpy array you're copying.
> 
> I.e.
> 
> enqueue_copy(dev_buf, my_array, byte_count=16384)
> 
> would be equivalent to
> 
> enqueue_copy(dev_buf, my_array.ravel()[:16384 // my_array.dtype.itemsize])
> 
> Since both '.ravel()' (on a contiguous array) and view creation are O(1)
> operations (i.e. they don't copy data at all), this should do what you
> want, unless I'm misunderstanding you. (If so, please clarify.)
> 
> Thanks,
> Andreas
> 

The problem is that slicing the flattened array to [:16384] only returns 
16384 elements if my_array is at least that large. In my case, the last 
slice is slightly smaller, since I have to add a dead zone to the end of 
the image so that it divides into buffers of the same size. The input 
image itself is unsliced and has no such dead zone, so the final slice 
runs past the end of the array and comes back truncated.
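
To make the edge case concrete, here is a minimal sketch in plain Python of how the slices come out. The helper name plan_slices and the toy sizes are made up for illustration, and a list stands in for the flattened image. It only shows that ordinary slicing silently truncates the final slice instead of raising an error.

```python
def plan_slices(total_elems, num_slices):
    """Return (slice_elems, [(start, valid_elems), ...]) for equal-size padded slices."""
    # Round up so that num_slices * slice_elems covers the whole image.
    slice_elems = -(-total_elems // num_slices)  # ceiling division
    plan = []
    for i in range(num_slices):
        start = i * slice_elems
        # Number of real image elements in this slice; the rest is dead zone.
        valid = max(0, min(slice_elems, total_elems - start))
        plan.append((start, valid))
    return slice_elems, plan


image = list(range(10))  # stand-in for the flattened image (10 elements)
slice_elems, plan = plan_slices(len(image), 4)

for start, valid in plan:
    # Slicing past the end does not fail; it just returns fewer elements.
    chunk = image[start:start + slice_elems]
    assert len(chunk) == valid
```

With 10 elements and 4 slices, every device buffer holds 3 elements, but the last slice of the source only has 1, which is the size mismatch described above.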

This causes enqueue_copy to fail with an INVALID PARAMETER error, which I 
strongly suspect is because the source array and destination buffer are of 
different sizes.  Ideally, it would not fail, but rather simply copy 
min(source_size,dest_size) bytes, which is the desired result in this case.
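
For reference, the semantics I have in mind look roughly like the host-side sketch below. copy_min is a hypothetical helper, not a PyOpenCL function, and a bytearray stands in for the fixed-size device buffer. It assumes both objects expose a contiguous buffer (a contiguous numpy array should qualify as well).

```python
from array import array


def copy_min(dst, src):
    """Copy min(size of src, size of dst) bytes to the start of dst; return the count."""
    src_bytes = memoryview(src).cast("B")  # flat byte view, no data copied
    dst_bytes = memoryview(dst).cast("B")
    n = min(src_bytes.nbytes, dst_bytes.nbytes)
    dst_bytes[:n] = src_bytes[:n]          # bytes past n in dst are left untouched
    return n


slice_buf = bytearray(16)                  # stand-in for a padded, fixed-size buffer
last_slice = array("f", [1.0, 2.0, 3.0])   # truncated final slice, 12 bytes

copied = copy_min(slice_buf, last_slice)   # "upload": the padding stays as it was

result = array("f", [0.0, 0.0, 0.0])
read_back = copy_min(result, slice_buf)    # "download": only the valid part is read
```

The same call covers both directions: the short array fills the beginning of the larger buffer, and reading back takes only the first part of the larger buffer.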



_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
