On Fri, 29 Apr 2011 15:53:05 -0400, Frédéric Bastien <[email protected]> wrote:
> I agree with what you said, but I was trying to say something else.
> Here is an example that should make clearer what I mean:
>
> import pycuda.autoinit
> import pycuda.gpuarray
> import numpy
> x = numpy.random.rand(5,4)
> print x.shape #(5, 4)
> print x.strides #(32, 8)
> x_strided = numpy.random.rand(5,4)[::2]
> print x_strided.shape #(3, 4)
> print x_strided.strides #(64, 8)
> x_fortran = numpy.asfortranarray(x)
> print x_fortran.shape #(5, 4)
> print x_fortran.strides #(8, 40)
>
> gx = pycuda.gpuarray.to_gpu(x)
> assert gx.shape == x.shape
> assert gx.strides == x.strides
> assert (gx.get() == x).all()
> gx_fortran = pycuda.gpuarray.to_gpu(x_fortran)
> assert gx_fortran.shape == x_fortran.shape
> assert gx_fortran.strides == x_fortran.strides
> assert (gx_fortran.get() == x_fortran).all()
>
> gx_NOT_strided = pycuda.gpuarray.to_gpu(x_strided)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/u/bastienf/repos/pycuda.git/build/lib.linux-x86_64-2.5/pycuda/gpuarray.py", line 784, in to_gpu
>     result.set(ary)
>   File "/u/bastienf/repos/pycuda.git/build/lib.linux-x86_64-2.5/pycuda/gpuarray.py", line 222, in set
>     assert self.flags.forc
> AssertionError
> assert gx_NOT_strided.shape == x_strided.shape
> assert gx_NOT_strided.strides != x_strided.strides
> assert (gx_NOT_strided.get() == x_strided).all()
>
> As you can see, the last case doesn't work. I think the attached patch
> strided_transfer.diff should fix that.
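[The failing `assert self.flags.forc` above is numpy's contiguity flag: `set()` requires the host array to be C- or Fortran-contiguous so the transfer can be byte-for-byte. A rough pure-Python sketch of that check, computed from shape, strides, and itemsize alone, illustrative only and not PyCUDA's or numpy's actual implementation:]

```python
def is_forc(shape, strides, itemsize):
    """Rough sketch of numpy's flags.forc: True when the bytes form one
    dense block in C order or in Fortran order. (Real numpy also treats
    zero-sized and length-1 axes specially; this is a simplification.)"""
    def dense(dims, strs):
        # Walk from the fastest-varying axis outward; each axis must
        # step by exactly the size of the block beneath it.
        expected = itemsize
        for dim, stride in zip(dims, strs):
            if dim != 1 and stride != expected:
                return False
            expected *= dim
        return True
    c_order = dense(reversed(shape), reversed(strides))
    f_order = dense(shape, strides)
    return c_order or f_order

# The arrays from the example above (float64, itemsize 8 bytes):
print(is_forc((5, 4), (32, 8), 8))   # x: C-contiguous -> True
print(is_forc((5, 4), (8, 40), 8))   # x_fortran: F-contiguous -> True
print(is_forc((3, 4), (64, 8), 8))   # x_strided: neither -> False
```

The `[::2]` slice keeps the row stride of the parent array (64 bytes) while a dense (3, 4) float64 array would need (32, 8), which is exactly why `set()` refuses it.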
It's not clear to me that this is a fix, or rather, it's not clear that what
you're seeing is a problem. The error you're getting tells you that you are
trying to copy a non-contiguous array. Since GPU<->host transfers are defined
as byte-for-byte, they depend on the memory layout of the host array, so it
seems invalid to just do the copy and thereby change the memory layout behind
the user's back. I'm more comfortable making the user responsible for handing
us a contiguous array; everything else seems ill-defined.

> Suppose that I implement slicing today (gpu_array[::2]) so that it
> generates strided gpuarrays, and modify all PyCUDA code to check
> whether the input is contiguous before working on it. Then everything
> in PyCUDA will be consistent.

And that's how things are. I've added contiguity checks (AFAICT) everywhere
in PyCUDA/OpenCL that relies on them.

> In that case, there is no code that will use this new gpuarray in
> PyCUDA, but an error will be raised if it gets used.
>
> Even so, I still see a problem that will give bad results without an
> error or warning. I think this problem WILL happen (which is why I
> keep discussing it).
>
> The problem is that a user who wrote his own GPU code may forget to
> modify it to check the strides. If he then starts using the new
> strided gpuarray with his own code, he will get bad results without an
> error or warning. It is easy to say that it is the user's fault (and
> it is...), but in a setting where a few people write GPU code and many
> use it (like in our lab), if the person who writes the GPU code
> doesn't have time to update all of it and some other user starts using
> the strided gpuarray, it gets harder to say it is the user's fault.
>
> Also think of people using the strided gpuarray with PyCULA or other
> libraries that claim to support pycuda.gpuarray. In that case,
> everybody needs to update their code at the same time to make this
> work...
>
> I have a simple workaround for this: just make a class that inherits
> from GPUArray.
> In the __init__, call GPUArray.__init__ and after that rename
> self.gpudata to self.gpunddata. Also, don't allow GPUArray to have
> strides that are not contiguous. So if someone uses old code with a
> strided gpuarray, it will raise an error (without a good message, but
> we could document it well). To make it a little easier on the
> interface, we can keep self.gpudata when the array is contiguous, so
> people will be able to use the new class with old code as long as they
> don't use the stride feature.
>
> I prefer two array classes, of which we deprecate one, over silently
> computing wrong results. As I said, I don't think it is hard or
> time-consuming to do. I will try to write one next week.

I finally see what you're saying, although I don't think that the two-class
solution would help much. Most people might not even check the type of what
they're getting. A better solution might be to change the GPUArray.gpudata
attribute to assert contiguity, and create a new .noncontig_data attribute
that does not assert contiguity. (We'll have to think of a better name.)
What do you think?

> I will keep the list and you updated. I always thought that PyOpenCL
> had a similar Python interface to PyCUDA's, so I think the changes to
> PyCUDA could be redone for PyOpenCL without too much difficulty. But I
> may be wrong, as I haven't used it.
>
> >> I think we can compare the current Python GPU base array situation
> >> to before numpy existed, i.e. there are many variants that are not
> >> directly compatible, but very close.
> >
> > I disagree--GPUArray was explicitly conceived as a numpy array
> > workalike, and it'll slowly grow to get closer to that goal.
>
> I was referring to the fact that in Theano we currently use our own
> CudaNdarray object, which is not directly compatible with your
> GPUArray. There is also a different base type in CudaMat. At that
> level, I still think we need to find a common base object that could
> be used by everybody in the Python GPU community.
> That will help to make code more portable and will help to keep the
> community closer together (instead of fragmented, as it is now).

Sure, something like this seems useful. One possibility might be to imitate
the host-side array interface, rather than enforcing a common base type.

On the other hand, what were your reasons for not using GPUArrays in Theano?
Anything that could be fixed?

Andreas
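[Editor's note: the "gpudata asserts contiguity" proposal above might look roughly like the sketch below. The class and the .noncontig_data attribute are hypothetical, not actual PyCUDA API; the point is only that stride-unaware kernels fail loudly while stride-aware code opts in explicitly.]

```python
class StridedArraySketch:
    """Hypothetical sketch of the proposal: .gpudata refuses to hand
    out the device pointer for non-contiguous arrays, so old kernels
    that assume a dense layout raise instead of silently computing
    wrong results."""

    def __init__(self, pointer, contiguous):
        self._pointer = pointer        # stand-in for the device allocation
        self._contiguous = contiguous  # flags.forc-style contiguity flag

    @property
    def gpudata(self):
        # Old, stride-unaware code lands here and fails loudly.
        if not self._contiguous:
            raise RuntimeError(
                "array is not contiguous; use .noncontig_data "
                "if your kernel honors strides")
        return self._pointer

    @property
    def noncontig_data(self):
        # Explicit opt-in: the caller promises to respect the strides.
        return self._pointer
```

A contiguous array behaves exactly as before; only code that touches `.gpudata` on a strided array sees the error, which matches the "fail instead of silently corrupting results" goal of the discussion.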
_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
