On Tue, 2010-09-28 at 23:51 -0400, Andreas Kloeckner wrote:
> On Tue, 28 Sep 2010 23:56:47 +0200, Tomasz Rybak <bogom...@post.pl> wrote:
> > I have an idea for (maybe) checking whether the problem is with PyCUDA,
> > the CUDA toolkit, or the driver.
> > Can you force PyCUDA to generate not sm_20 code, but sm_1x?
> > I have found that it is determined in line 190 of the file
> > pycuda/compiler.py:
> > arch = "sm_%d%d" % Context.get_device().compute_capability()
> > Try changing it to
> > arch = "sm_10"
> > and so on, and check whether you still get the incorrect 14 in such
> > a case.
> > 
> > If there is a simpler way of changing the architecture for which
> > PyCUDA generates code, feel free to use it and share this
> > information.
> 
> arch can be overridden from the SourceModule arguments:
> http://documen.tician.de/pycuda/driver.html#module-pycuda.compiler
> 

Yes, but the code from this thread calls GPUArray.dot,
which in turn uses ReductionKernel, and in neither of those
have I seen a way to pass an arch='sm_10' argument.

I have checked, and
dot_ab_gpu = gpuarray.dot(a_gpu, b_gpu, arch='sm_11').get()
raises an error.
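
One possible way around this, as a rough sketch only: skip gpuarray.dot and
compile a small kernel directly through SourceModule, which does accept arch.
The names arch_string, dot_with_arch and the kernel multiply are made up for
illustration, and this is not how gpuarray.dot is implemented internally. The
elementwise products are summed on the host to keep it short. I assume a
working CUDA device and a toolkit that still accepts the requested arch.

```python
def arch_string(compute_capability):
    """Format a (major, minor) tuple the same way compiler.py does."""
    major, minor = compute_capability
    return "sm_%d%d" % (major, minor)


# Hypothetical kernel: elementwise product; the final sum is done on the host.
KERNEL_SOURCE = """
__global__ void multiply(float *out, const float *a, const float *b, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = a[i] * b[i];
}
"""


def dot_with_arch(a, b, arch=None):
    """Dot product of two 1-D arrays, compiled for an explicit arch.

    Imports are local so that the helper above can be used without a GPU.
    """
    import numpy as np
    import pycuda.autoinit  # creates a context on the default device
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    # arch=None lets PyCUDA pick the arch of the current device.
    mod = SourceModule(KERNEL_SOURCE, arch=arch)
    multiply = mod.get_function("multiply")

    a_gpu = gpuarray.to_gpu(np.asarray(a, dtype=np.float32))
    b_gpu = gpuarray.to_gpu(np.asarray(b, dtype=np.float32))
    out_gpu = gpuarray.empty_like(a_gpu)

    n = a_gpu.size
    block = 256
    grid = (n + block - 1) // block
    multiply(out_gpu, a_gpu, b_gpu, np.int32(n),
             block=(block, 1, 1), grid=(grid, 1))

    # Host-side reduction; simple, but not what ReductionKernel does.
    return float(out_gpu.get().sum())
```

With that, something like dot_with_arch(a, b, arch="sm_11") could be compared
against dot_with_arch(a, b, arch=arch_string((2, 0))) to see whether the
wrong result depends on the target architecture.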


-- 
Tomasz Rybak <bogom...@post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak


_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
