Hi,

I hit a similar problem when porting Theano to a newer version of CUDA (the
one that supports Fermi cards). The problem is not the card itself, but the
compiler/card combination. Newer nvcc versions optimize the code more
aggressively, and this breaks an old assumption that many people relied on.
The new optimization assumes that the compiler can reorder operations across
the threads of a warp if there is no synchronization barrier. I think this
was not a problem on pre-Fermi cards because a warp was executed
sequentially, but that is not always true on Fermi cards (I'm not certain
about pre-Fermi, but I'm sure about Fermi). The volatile keyword tells the
compiler that the value can change between thread operations, so it does not
apply the new optimization.
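
To make the role of volatile concrete, here is a minimal sketch of the kind
of reduction code that is affected. It is not the actual PyCUDA or Theano
patch; the names block_sum, warp_reduce and BLOCK_SIZE are made up for the
illustration. The last steps of the reduction run inside a single warp
without any barrier, so the shared-memory pointer is declared volatile to
stop the compiler from keeping partial sums in registers or reordering the
accesses. Note that this is the idiom of the Fermi era: on later
architectures with independent thread scheduling (Volta and newer), volatile
alone is not enough and __syncwarp() is needed between the steps.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_SIZE 128  // hypothetical block size; power of two, at least 64

// Last steps of the reduction, executed by the 32 threads of one warp
// with no __syncthreads().  Because sdata is volatile, every read and
// write really goes to shared memory, in program order.
__device__ void warp_reduce(volatile float *sdata, unsigned int tid)
{
    sdata[tid] += sdata[tid + 32];
    sdata[tid] += sdata[tid + 16];
    sdata[tid] += sdata[tid + 8];
    sdata[tid] += sdata[tid + 4];
    sdata[tid] += sdata[tid + 2];
    sdata[tid] += sdata[tid + 1];
}

// Sums BLOCK_SIZE consecutive floats per block into out[blockIdx.x].
__global__ void block_sum(const float *in, float *out)
{
    __shared__ float sdata[BLOCK_SIZE];
    unsigned int tid = threadIdx.x;

    sdata[tid] = in[blockIdx.x * BLOCK_SIZE + tid];
    __syncthreads();

    // Steps that span several warps use an explicit barrier.
    for (unsigned int s = BLOCK_SIZE / 2; s > 32; s >>= 1) {
        if (tid < s)
            sdata[tid] += sdata[tid + s];
        __syncthreads();
    }

    // Warp-level steps: no barrier, so volatile is what keeps them correct.
    if (tid < 32)
        warp_reduce(sdata, tid);

    if (tid == 0)
        out[blockIdx.x] = sdata[0];
}

int main()
{
    const int n = BLOCK_SIZE;
    float h_in[n];
    float h_out = 0.0f;
    for (int i = 0; i < n; ++i)
        h_in[i] = 1.0f;

    float *d_in = NULL;
    float *d_out = NULL;
    cudaMalloc((void **)&d_in, n * sizeof(float));
    cudaMalloc((void **)&d_out, sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    block_sum<<<1, BLOCK_SIZE>>>(d_in, d_out);

    // This copy waits for the kernel to finish before reading the result.
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("sum = %f\n", h_out);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

If the volatile qualifier is dropped from warp_reduce, the compiler is free
to hold sdata[tid] in a register across the six steps, and each thread may
then add values that its neighbours have not yet written back.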

Did I explain clearly enough what happened?

Frédéric Bastien

P.S. On another topic: I remember saying I would start modifying PyCUDA to
have a strided ndarray on the GPU, but I got delayed. It is still on my to-do
list for when I have some time.

On Fri, Oct 22, 2010 at 3:02 AM, Julien Cornebise <julien.corneb...@gmail.com> wrote:

> Awesome, Tomasz, works like a charm, thank you for this trick ! Sorry
> for the delay in the reply, this mail passed below my radar...
>
> As confirmed by Andreas (thanks for having tried it!), on my machine
> test_gpuarray.py no longer reports any errors, that's great. And the
> same goes for the "dot array" test mentioned earlier on the mailing
> list. Bril-liant.
> I'm baffled, as I don't understand the change in detail, but I don't
> pretend to be well equipped for that. Still, if you want to explain
> what it does and what changed, I'll be extremely happy to have some
> light shed on it!
>
> Cheers,
>
> Julien
>
>
>
>
> On Mon, Oct 18, 2010 at 9:37 AM, Ian Ozsvald <i...@ianozsvald.com> wrote:
> > Just as a confirmation, I'm back at the office (after a month away)
> > and I've updated to:
> > * latest pyCUDA (0.94.2)
> > * NVIDIA Win XP 32bit driver 260.89 WHQL final release
> > * CUDA 3.2RC 32 bit (september 2010)
> > * on Win XP, 32 bit
> > and I'm using a new EVGA GTX 480 (a replacement for my last unit which
> > developed memory errors).
> >
> > Installation/building worked first time, test_cumath, test_driver,
> > test_gpuarray run with no errors.
> >
> > Cheers,
> > Ian.
> >
> > On 16 October 2010 22:01, Andreas Kloeckner <li...@informa.tiker.net> wrote:
> >> Hi Tomasz, all,
> >>
> >> On Fri, 15 Oct 2010 19:27:36 +0200, Tomasz Rybak <bogom...@post.pl> wrote:
> >>> Can anyone with Fermi (GTX 460, 470, 480 - are there other Fermi
> >>> cards?) tell whether the attached patch solves the problems with
> >>> GPUArray on Fermi?
> >>> There has been discussion here on this list (started on 2010-09-27
> >>> by jmcarval) about problems with GPUArray. In summary,
> >>> test/test_gpuarray.py failed four times on Fermi.
> >>>
> >>> I sent this patch to the mailing list on 2010-10-01, but got no
> >>> reply on whether it works or not.
> >>
> >> Sorry for taking a while to reply to stuff recently. I am in the process
> >> of getting settled into a new job and every once in a while, work
> >> (especially teaching) is a bit much at the moment. When that happens, I
> >> disappear for a little while. This will likely also happen in the
> >> future, but don't worry, I'm around, and once the workload drops a bit,
> >> I come back. :)
> >>
> >> Next, thank you very much for the careful analysis you've done on this
> >> bug. Especially given what I've said above, this was super-helpful and
> >> much appreciated.
> >>
> >> Julien Cornebise had previously given me access to one machine where the
> >> issue was reproducible (Thanks, Julien!), and just now I verified that
> >> a) the problem was still present before I applied your patch and b) that
> >> your patch seems to fix the issue. (Both Joao's simple example and the
> >> test suite.) This leads me to believe that the Fermi reduction mystery
> >> should be solved. Thanks very much for making this happen, Tomasz!
> >>
> >> The fix is already in git, and I've also released 0.94.2 to make sure
> >> that as many people as possible get the fixed code.
> >>
> >>> I would like to know if it works to know how to proceed with
> >>> PyCUDA packaging for Debian. CUDA toolkit is waiting to be
> >>> included, and as soon as it is accepted into Debian I intend
> >>> to ask for sponsorship for PyCUDA packages.
> >>> I am not sure, however, if I should leave PyCUDA as is (and
> >>> risk bug reports from users with Fermi GPUs) or apply an untested
> >>> patch and risk that it does not work fully or has some side effects.
> >>
> >> I hope the above solves your packaging dilemma, too.
> >>
> >> Andreas
> >>
> >>
> >> _______________________________________________
> >> PyCUDA mailing list
> >> PyCUDA@tiker.net
> >> http://lists.tiker.net/listinfo/pycuda
> >>
> >>
> >
> >
> >
> > --
> > Ian Ozsvald (A.I. researcher, screencaster)
> > i...@ianozsvald.com
> >
> > http://IanOzsvald.com
> > http://MorConsulting.com/
> > http://blog.AICookbook.com/
> > http://TheScreencastingHandbook.com
> > http://FivePoundApp.com/
> > http://twitter.com/IanOzsvald
> >
> >
>