On 02.07.2012 17:41, Jerome Glisse wrote:
> On Fri, Jun 29, 2012 at 12:15 PM, Michel Dänzer <michel at daenzer.net> wrote:
>> On Fre, 2012-06-29 at 17:18 +0200, Christian König wrote:
>>> On 29.06.2012 17:09, Michel Dänzer wrote:
>>>> On Fre, 2012-06-29 at 16:45 +0200, Christian König wrote:
>>>>> Hold the ring lock the whole time the reset is in progress,
>>>>> otherwise another process can submit new jobs.
>>>> Sounds good, but doesn't this create other paths (e.g. initialization,
>>>> resume) where the ring is being accessed without holding the lock? Isn't
>>>> that a problem?
>>> Thought about that also.
>>>
>>> For init I'm pretty sure that no application can submit commands before
>>> we are done, otherwise we are doomed anyway.
>>>
>>> For resume I'm not really sure, but I think that applications are
>>> resumed after the hardware driver has had a chance to do so.
>> I hope you're right... but if it's not too much trouble, it might be
>> better to be safe than sorry and take the lock for those paths as well.
>>
> NAK, this is the wrong way to solve the issue; we need a global lock on
> all paths that can trigger GPU activity. Previously it was the CS
> mutex, but I hadn't thought about it too much when it got removed. So
> to fix the situation I am sending a patch with a rw semaphore.

So what am I missing? What else can trigger GPU activity if not the rings?
I'm currently working on partial ring resets, and also on resets that only
skip over a single faulty IB instead of flushing the whole ring. My current
idea for making that work is that we hold the ring lock while we do suspend,
ring_save, asic_reset, resume and ring_restore.

Christian.

> Cheers,
> Jerome