On Tue, 2010-09-28 at 00:29 -0700, jmcarval wrote:
> Thanks for your reply.
> I've read the first thread you mention, that ends without a solution
> http://pycuda.2962900.n2.nabble.com/PyCUDA-pycuda-test-failures-tp5320194p5320194.html
>
> Maybe I'm doing a huge mistake but it does not seem to be a precision
> detail.
> The following code (a simplification of test_gpuarray), returns 30 from the
> CPU and 14 from the GTX480, either with integer, float32 or float64.
> I don't get it. Can anybody explain me what I'm doing wrong please?
> Thanks
>
> import pycuda.autoinit
> import numpy
> import pycuda.gpuarray as gpuarray
> from pycuda.curandom import rand as curand
>
> a = numpy.array([1,2,3,4])#.astype(numpy.float32)
> a_gpu = gpuarray.to_gpu(a)
> b = a
> b_gpu = gpuarray.to_gpu(b)
>
> dot_ab = numpy.dot(a, b)
>
> dot_ab_gpu = gpuarray.dot(a_gpu, b_gpu).get()
>
> print "CPU dot product:", dot_ab
> print "GPU dot product:", dot_ab_gpu
>
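One small observation on the two numbers in the quoted report, offered only as a hint: 14 happens to be the dot product of the first three elements alone (1 + 4 + 9), so the difference from 30 is exactly the contribution of the last element (4 * 4 = 16). This does not prove that the GPU reduction skips an element, but it is consistent with the result not being a precision issue. The sketch below checks the arithmetic on the CPU in plain Python, with no PyCUDA involved:

```python
# Plain-Python check of the numbers from the report (no GPU needed).
a = [1, 2, 3, 4]
b = a

# Full dot product, as numpy.dot(a, b) would compute it.
full = sum(x * y for x, y in zip(a, b))

# Dot product over the first three elements only.
partial = sum(x * y for x, y in zip(a[:3], b[:3]))

print("full dot product:   ", full)
print("first three only:   ", partial)
print("missing contribution:", full - partial)
```

Repeating the GPU test with a different array length and seeing which elements would have to be missing to explain the result might narrow things down further.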
I have an idea for (maybe) checking whether the problem lies with PyCUDA, the CUDA toolkit, or the driver. Can you force PyCUDA to generate code for a 1.x architecture instead of sm_20? I found that the target architecture is determined on line 190 of pycuda/compiler.py:

    arch = "sm_%d%d" % Context.get_device().compute_capability()

Try changing it to

    arch = "sm_10"

and so on, and check whether you still get the incorrect 14 in that case. If there is a simpler way to change the architecture PyCUDA generates code for, feel free to use it and share it here.

Regards.

-- 
Tomasz Rybak <bogom...@post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak
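To make the suggested change concrete, here is a minimal sketch of how the architecture string is formed and what overriding it amounts to. The helper name pick_arch and its override parameter are hypothetical and are not part of PyCUDA; only the "sm_%d%d" formatting is taken from the compiler.py line quoted above. The tuple (2, 0) is the compute capability of a Fermi card such as the GTX 480.

```python
def pick_arch(compute_capability, override=None):
    """Return the nvcc architecture string for a device.

    Hypothetical helper, not a PyCUDA function: it mirrors the
    "sm_%d%d" formatting from pycuda/compiler.py and adds an
    optional manual override for testing.
    """
    if override is not None:
        # Forced architecture, e.g. "sm_10", for the experiment.
        return override
    major, minor = compute_capability
    return "sm_%d%d" % (major, minor)


# Default behaviour on a compute capability 2.0 device.
print(pick_arch((2, 0)))

# What the suggested edit effectively does.
print(pick_arch((2, 0), override="sm_10"))
```

If the 14 goes away with "sm_10" or "sm_13" and comes back with "sm_20", that would point at the code generated for Fermi rather than at the host-side PyCUDA logic.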
_______________________________________________ PyCUDA mailing list PyCUDA@tiker.net http://lists.tiker.net/listinfo/pycuda