Re: [PyCuda] pycuda memcpy_htod

Nicholas Tung Sun, 01 Mar 2009 13:24:59 -0800

It's good that non-linear arrays fail. As for non-contiguous ones, I think
it would be much better to check (it's just a flag check) instead of relying
on the user to read all of the documentation (I certainly don't).


Since you probably want to quality-control your source (and my sandbox is
messy), here is a suggested solution instead of a patch: rename memcpy_htod
to something like memcpy_htod_unchecked (still exposed to the user), create
the function below under the original name, and document it appropriately.
Attached is a simple Python test.

void py_memcpy_htod(CUdeviceptr dst, py::object src, py::object stream_py)
  {
    // Only numpy arrays carry layout flags; other buffer objects pass through.
    if (PyArray_Check(src.ptr())) {
        PyArrayObject *arr = (PyArrayObject *)src.ptr();
        // PyArray_ISCONTIGUOUS tests the C-contiguous flag bit.
        if (!PyArray_ISCONTIGUOUS(arr)) {
            throw std::runtime_error("[memcpy] Array not C contiguous; "
                "see \"Device Interface Reference Documentation\"");
        }
    }
    py_memcpy_htod_unchecked(dst, src, stream_py);
  }
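
For anyone who wants the same guard without rebuilding PyCuda, the check
can be sketched on the Python side. This is only an illustration: the name
require_c_contiguous is hypothetical (not a PyCuda function), and it
assumes Python 3.3 or later, where memoryview exposes a c_contiguous
attribute. It accepts any object that supports the buffer protocol, which
includes numpy arrays.

```python
def require_c_contiguous(obj):
    """Raise ValueError unless obj exposes a C-contiguous buffer.

    Hypothetical helper for illustration; not part of PyCuda.
    """
    view = memoryview(obj)  # bytes, bytearray, numpy arrays, ...
    if not view.c_contiguous:
        raise ValueError("buffer is not C contiguous; copy it first")
    return view


data = bytearray(b"hello world")
require_c_contiguous(data)  # a plain bytearray is contiguous

# A strided 1-D view skips every other byte, so it is not contiguous.
try:
    require_c_contiguous(memoryview(data)[::2])
except ValueError as exc:
    print("rejected:", exc)
```

One would call such a helper right before drv.memcpy_htod; the C++ version
above has the advantage that nobody can forget to call it.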

regards,
Nicholas

On Sun, Mar 1, 2009 at 12:12, Andreas Klöckner <[email protected]> wrote:

> On Sunday 01 March 2009, you wrote:
> > > What's the failure? If it's something non-intuitive, we should catch it
> > > in PyCuda and give a nicer warning.
> >
> > The failure is that the wrong data is transferred to the kernel; it
> > appeared to be something like the array transposed (which, needless to
> > say, can be very bad, particularly if loop bounds are taken from
> > corrupted memory).
>
> numpy supports arbitrary strides in its arrays, which, among other things,
> can make them column- or row-major (i.e. have Fortran or C order). GPUArray
> currently has no stride support whatsoever. In the long run, having stride
> support in GPUArray would likely be desirable. Introducing strides would
> allow us to introduce indexing in the same way that numpy allows.
>
> Further, numpy allows many types of funky arrays (non-contiguous, for
> example). PyCuda currently does very little to support these funky arrays,
> but at least it doesn't behave incorrectly:
>
> >>> import pycuda.autoinit
> >>> import pycuda.gpuarray as ga
> >>> import numpy
> >>> z = numpy.zeros((10,10), dtype=numpy.float32)
> >>> ga.to_gpu(z[:,2:3])
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "/home/kloeckner/src/env/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py", line 401, in to_gpu
>    result.set(ary, stream)
>  File "/home/kloeckner/src/env/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py", line 91, in set
>    drv.memcpy_htod(self.gpudata, ary, stream)
> TypeError: expected a single-segment buffer object
>
> This is easy to work around for now--a simple .copy() and things work.
>
> > Looks like C_CONTIGUOUS is what we're looking for. The numpy documentation
> > mentions this and a possibly applicable function call:
> > http://numpy.scipy.org/numpydoc/numpy-13.html#marker-59740
>
> In a sense, PyCuda merely did what it was asked to do, which is transfer
> the numpy array in the exact same layout that it had on the host. On the
> one hand, I intentionally transfer Fortran-layout arrays onto the GPU in
> some of my code, and I think that's perfectly fine behavior.
>
> You have a point in that, at present, none of the stride information in the
> numpy array is preserved in a GPUArray copy, which means that
> gpuarray.to_gpu(a).get() may result in many funny things, but only for
> C-contiguous arrays will you get back out what you put in. This is a bug and
> needs to be fixed, but the fix would likely be a part of the stride
> implementation cited above.
>
> If, in the meantime, you want to phrase a warning for the documentation,
> I'd be happy to merge that.
>
> Andreas
>
>
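
To make the stride discussion quoted above concrete, here is a small
pure-Python sketch of why a Fortran-ordered array looks scrambled (roughly
transposed) to a reader that assumes C order. The helper names strides_for
and read are made up for this illustration, and strides are counted in
elements rather than bytes to keep the arithmetic short.

```python
def strides_for(shape, order="C"):
    """Element strides of a dense array with the given shape and order."""
    strides = [0] * len(shape)
    step = 1
    # C order: last axis varies fastest. Fortran order: first axis does.
    if order == "C":
        axes = range(len(shape) - 1, -1, -1)
    else:
        axes = range(len(shape))
    for axis in axes:
        strides[axis] = step
        step *= shape[axis]
    return tuple(strides)


def read(flat, shape, strides):
    """Interpret a flat buffer as a 2-D nested list using the strides."""
    rows, cols = shape
    return [[flat[r * strides[0] + c * strides[1]] for c in range(cols)]
            for r in range(rows)]


# The logical array [[0, 1, 2], [3, 4, 5]] laid out in Fortran order.
shape = (2, 3)
flat_f = [0, 3, 1, 4, 2, 5]

intended = read(flat_f, shape, strides_for(shape, "F"))
stride_unaware = read(flat_f, shape, strides_for(shape, "C"))
print(intended)
print(stride_unaware)
```

The second result is what a kernel sees if the bytes are copied verbatim and
then indexed as row-major, which matches the symptom described in the quoted
message.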

#!/usr/bin/env python

import numpy
import pycuda.autoinit
import pycuda.driver as drv

b = buffer("hello world")  # Python 2 buffer object (non-numpy source)
z = numpy.zeros((10,10))
d = numpy.concatenate([z, numpy.zeros((10, 0))], axis=1)
# d is expected to be non-C-contiguous; the printed flags show whether
# that holds for the numpy version in use
print(z.flags)
print(d.flags)
mem = drv.mem_alloc(z.nbytes)  # 10x10 float64 needs 800 bytes, not 300
try:
    drv.memcpy_htod(mem, z)
    drv.memcpy_htod(mem, b)
except Exception:
    print("ERROR - test failed; first memcpy's should work")
else:
    try:
        drv.memcpy_htod(mem, d) # should throw an exception
    except Exception:
        print("OK - non-contiguous array rejected")
    else:
        print("ERROR - test failed")
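
For readers without a GPU at hand, the layout side of the test can be
checked with numpy alone. The sketch below assumes only numpy; it prints
the flag that the proposed check would look at and shows
numpy.ascontiguousarray as one way to obtain a C-ordered copy before a
transfer (the .copy() workaround mentioned in the quoted message does the
same job).

```python
import numpy

z = numpy.zeros((10, 10), dtype=numpy.float32)
column = z[:, 2:3]   # strided view into z, as in the to_gpu example
transposed = z.T     # Fortran-ordered view of the same memory

for name, arr in [("z", z), ("column", column), ("transposed", transposed)]:
    print(name, arr.flags.c_contiguous, arr.strides)

# ascontiguousarray returns a C-contiguous array, copying only if needed.
fixed = numpy.ascontiguousarray(column)
print("fixed", fixed.flags.c_contiguous, fixed.shape)
```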
