Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Thu, 05 Jan 2012 21:41:34 -0800, Keith Packard kei...@keithp.com wrote: I talked with Eric about this and we decided that the whole splitting out of the i/o functions just doesn't make any sense. That makes this series very similar to Daniel's patches, so I'll rebase my bug fixes on top of those changes and send out a (shorter) series tomorrow. And here's an updated version of my patches built on top of Daniel's: The following changes since commit d06d2756a21a0c666f167ae9e4f13ef5f07f67d9: acpi/video: Don't restore backlight to 0 at boot time (2012-01-06 11:10:25 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux forcewake-spinlock Daniel Vetter (1): drm/i915: protect force_wake_(get|put) with the gt_lock Daniel's patch (v3) Keith Packard (3): drm/i915: Move reset forcewake processing to gen6_do_reset This moves the forcewake code inside gen6_do_reset, at the same time it changes from unconditionally calling __gen6_gt_force_wake_get to using dev_priv-display.force_wake_get. That could be broken out as a separate patch -- it's just a bug. drm/i915: Hold gt_lock during reset drm/i915: Hold gt_lock across forcewake register reads These two patches eliminate a race between chip reset and other read operations. By holding the gt_lock during all read operations, as well as across reset, we can ensure that forcewake is active for all register reads. Otherwise, right after chip reset, forcewake can be inactive, but the internal forcewake_count may be non-zero. As a nice side-effect, this eliminates taking the gt_lock twice during all register reads. Please take a look and see if these are all reasonable additions to the original patch and when it's ready, I'll push the whole sequence to drm-intel-fixes. -- keith.pack...@intel.com pgpAuvzRPjx4m.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Wed, Jan 04, 2012 at 06:22:41PM -0800, Keith Packard wrote: On Wed, 4 Jan 2012 19:12:57 +0100, Daniel Vetter dan...@ffwll.ch wrote: The Correct? was just to check my understanding of your concern, I still think its invalid. You've cut away the second part of my mail where I explain why and I don't see you responding to that here. Also micro-optimizing the gpu reset code sounds a bit strange. Sorry, I didn't explain things very well. Right now, our register access looks like: get(struct_mutex); if (++forcewake_count == 1) force_wake_get() value = read32(reg) or write32(reg, val) if (--forcewake_count == 0) force_wake_put(); /* more register accesses may follow ... */ put(struct_mutex); All very sensible, the whole register sequence is covered by struct_mutex, which ensures that the forcewake is set across the register access. The patch does: get(spin_lock) if (++forcewake_count == 1) force_wake_get() put(spin_lock) value = read32(reg) or write32(reg, val) get(spin_lock) if (--forcewake_count == 0) force_wake_put() put(spin_lock) I realize that all current users hold the struct_mutex across this whole sequence, aside from the reset path, but we're removing that requirement explicitly (the patch removes the WARN_ON calls). Without a lock held across the whole sequence, it's easy to see how a race could occur. We're also dropping and re-acquiring a spinlock with a single instruction between, which seems wasteful. Instead, we should be doing: get(spin_lock) force_wake_get(); value = read32(reg) or write32(reg,val) force_wake_put(); put(spin_lock); No need here to deal with the wake lock at reset time; the whole operation is atomic wrt interrupts. It's more efficient *and* correct, without depending on the old struct_mutex locking. If you want to continue to expose the user-mode wake lock stuff, you could add: get(spin_lock); if (!forcewake_count) force_wake_get(); value = read32(reg) or write32(reg,val) if (!forcewake_count) force_wake_put(); put(spin_lock); This would require that the reset code also deal with the forcewake_count to restore the user-mode force wake. A further optimization would hold the force_wake active for 'a while' to avoid the extra bus cycles required, but we'd want to see a performance problem from this before doing that. I think you've lost me ... A few random comments first, it looks like I neeed more coffee: - The reset code (running from a workqueue) does hold sturct mutex. It's the hangcheck and error state capture code running from softirq/timer context causing issues. - Forcewake works like a reference counted resources. As long as all _get/_put calls are properly balanced, I don't see how somebody could sneak in in between (when we don't hold the spinlock) and cause havoc. For paranaoia we might want to drop a WARN_ON in the _put call to check whether it ever drops below 0. But all current users are clearly balanced, so I didn't bother with it. - My missed IRQ w/a actually relies on this by grabbing a forcewake ref in get_irq and dropping it again in put_irq. In between there's usually a schedule(). - I've pondered with Chris whether we should do your proposed optimization but we then noticed that the gem code doesn't actually read from any forcewake protected registers in normal execution (I don't consider running the hangcheck timer normal ;-). With my missed irq w/a that now changes, so we might need to reconsider this. But imo that's material for a separate patch. Yours, Daniel -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Thu, 5 Jan 2012 12:29:08 +0100, Daniel Vetter dan...@ffwll.ch wrote: - The reset code (running from a workqueue) does hold sturct mutex. It's the hangcheck and error state capture code running from softirq/timer context causing issues. Right, I mis-wrote; I meant the hangcheck timer (which I always think of as part of the reset code). - Forcewake works like a reference counted resources. As long as all _get/_put calls are properly balanced, I don't see how somebody could sneak in in between (when we don't hold the spinlock) and cause havoc. For paranaoia we might want to drop a WARN_ON in the _put call to check whether it ever drops below 0. But all current users are clearly balanced, so I didn't bother with it. Right, I was just confused somehow. Still seems weird to me to drop a spinlock, execute a single instruction, and then immediately re-acquire it, along with bumping forcewake_count twice. - My missed IRQ w/a actually relies on this by grabbing a forcewake ref in get_irq and dropping it again in put_irq. In between there's usually a schedule(). This is essentially the same as the user-level forcewake and would be handled in the same way -- keep forcewake_count, but use it only for long-term values. - I've pondered with Chris whether we should do your proposed optimization but we then noticed that the gem code doesn't actually read from any forcewake protected registers in normal execution (I don't consider running the hangcheck timer normal ;-). With my missed irq w/a that now changes, so we might need to reconsider this. But imo that's material for a separate patch. Yeah, all sounds reasonable. That separate patch can actually use per-chip functions to read/write from the chip so we can also avoid checking the forcewake stuff on register reads for older generation hardware. Make it work, then make it work faster. -- keith.pack...@intel.com pgpIwsiibUbiB.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
Looks like we managed to clear up our mutual confusion here ;-) On Thu, Jan 05, 2012 at 07:49:12AM -0800, Keith Packard wrote: On Thu, 5 Jan 2012 12:29:08 +0100, Daniel Vetter dan...@ffwll.ch wrote: - The reset code (running from a workqueue) does hold sturct mutex. It's the hangcheck and error state capture code running from softirq/timer context causing issues. Right, I mis-wrote; I meant the hangcheck timer (which I always think of as part of the reset code). - Forcewake works like a reference counted resources. As long as all _get/_put calls are properly balanced, I don't see how somebody could sneak in in between (when we don't hold the spinlock) and cause havoc. For paranaoia we might want to drop a WARN_ON in the _put call to check whether it ever drops below 0. But all current users are clearly balanced, so I didn't bother with it. Right, I was just confused somehow. Still seems weird to me to drop a spinlock, execute a single instruction, and then immediately re-acquire it, along with bumping forcewake_count twice. Absolutely agreed, it's really ugly. But especially for locking changes I'd like a patch to do one thing, and one thing only. And I didn't see the upside of a separate patch to fix things up, also because the current I915_WRTE|READ macro maze is a bit hellish. - My missed IRQ w/a actually relies on this by grabbing a forcewake ref in get_irq and dropping it again in put_irq. In between there's usually a schedule(). This is essentially the same as the user-level forcewake and would be handled in the same way -- keep forcewake_count, but use it only for long-term values. - I've pondered with Chris whether we should do your proposed optimization but we then noticed that the gem code doesn't actually read from any forcewake protected registers in normal execution (I don't consider running the hangcheck timer normal ;-). With my missed irq w/a that now changes, so we might need to reconsider this. But imo that's material for a separate patch. Yeah, all sounds reasonable. That separate patch can actually use per-chip functions to read/write from the chip so we can also avoid checking the forcewake stuff on register reads for older generation hardware. Make it work, then make it work faster. Absolutely agreed, maybe with the adadendum to only try to make things faster if it's actually a problem and shows up in a fast-path we care about. Cheers, Daniel -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Thu, 5 Jan 2012 17:59:47 +0100, Daniel Vetter dan...@ffwll.ch wrote: Absolutely agreed, maybe with the adadendum to only try to make things faster if it's actually a problem and shows up in a fast-path we care about. Here's a longer series that does a bunch of cleanup before trying to fix things. Patches marked with '***' fix bugs. The patch marked with '...' is the optimization to inline the spinlocks. The following changes since commit d8e70a254d8f2da141006e496a51502b79115e80: drm/i915: only set the intel_crtc DPMS mode to on if the mode set succeeded (2012-01-03 14:55:52 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux forcewake-spinlock Keith Packard (9): drm/i915: Split register access functions out from display functions The forcewake functions are invoked unconditionally on = gen6 hardware from the register read/write functions. Having these initialized as a side-effect of display initialization seems wrong to me. I've moved the functions out of the display structure and into a separate structure, and moved the initialization to driver load time. drm/i915: Access registers through function pointers This makes register access go through function pointers, following similar changes in many other parts of the driver. drm/i915: Split out reg read/write for pre/post gen6 hardware Taking advantage of the previous indirection, this actually creates separate register read/write functions for pre-gen6 and post-gen6 hardware. drm/i915: Move forcewake_count to reg_access structure Just moves the count into the new structure to keep things together. drm/i915: hide forcewake_count behind i915_forcewake_count Create a function to hide getting the forcewake_count value drm/i915: Switch forcewake from atomic to using a spinlock This changes the type from atomic to u32 and wraps all users in a new spinlock. The spinlock is held across calls to -force_wake_put and -force_wake_get. *** drm/i915: Hold forcewake spinlock across reset process Changes the reset process to hold the spinlock -- this will ensure that all register operations will be correct wrt the spinlock, even if the hardware gets reset. *** drm/i915: Hold forcewake spinlock during register write operations This protects the gt_fifo_count value under the spinlock and keeps modifications to that tied to the actual register write. ... drm/i915: inline spin_lock usage in register read macros Here's the optimization I mentioned -- inlines the spinlocks inside the register read operations. -- keith.pack...@intel.com pgpO1LijxUca4.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Thu, 05 Jan 2012 16:29:43 -0800, Keith Packard kei...@keithp.com wrote: Here's a longer series that does a bunch of cleanup before trying to fix things. Patches marked with '***' fix bugs. The patch marked with '...' is the optimization to inline the spinlocks. I talked with Eric about this and we decided that the whole splitting out of the i/o functions just doesn't make any sense. That makes this series very similar to Daniel's patches, so I'll rebase my bug fixes on top of those changes and send out a (shorter) series tomorrow. -- keith.pack...@intel.com pgp2ElK9xiAiW.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Wed, Jan 4, 2012 at 00:33, Keith Packard kei...@keithp.com wrote: On Tue, 3 Jan 2012 22:49:52 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote: Nope, current hangcheck blows up, and we have an i-g-t testcase for it (which the commit msg clearly states). There are also numerous bug reports where a dying gpu results in tons of WARN_ON(!mutex_locked(dev-struct_mutex)) noise in dmesg (which drowns out the gpu hang warning). The locking change fixes this. Ah, ok, that makes sense. Of course, hangcheck *could* have just taken struct_mutex were it run in a suitable context. Nope, we cannot move the hangcheck into process context by using a delayed work item and then grabbing struct_mutex. If the gpu is dead, we usually have a task stuck waiting for it and already holding struct_mutex. It is *absolutely* imperial that the hangcheck and error state capture code do not block on anything that the i915 gem code might hold onto. The patch adds the required locking to i915_reset. No, the spinlock protects the forcewake_count access and not the actual register access, which leaves all kinds of potential for races in threads not also holding struct_mutex while accessing registers. Ah, I think I see you're concern: Between the time we reset the gpu and the time we fix up the forcewake state somebody might sneak in and see an inconstency between our tracking and the actual hw state, hence reading garbage. Correct? If you want a spinlock to protect the register access, it must surround the whole operation. Between the time the hangcheck declares the gpu dead and the time we deem it officially resurrected at the end of i915_reset there's no issue with returning garbage from register writes - after all, the gpu just went down. The only thing we have to take care of is that we don't leave behind an inconsistent state after i915_reset, which the current locking in my patch takes care of. Hence I think that no further protection is required. -Daniel -- Daniel Vetter daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Wed, Jan 04, 2012 at 09:54:08AM -0800, Keith Packard wrote: On Wed, 4 Jan 2012 18:11:18 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote: Ah, I think I see you're concern: Between the time we reset the gpu and the time we fix up the forcewake state somebody might sneak in and see an inconstency between our tracking and the actual hw state, hence reading garbage. Correct? Indeed. Plus, holding the spinlock across the whole operation also means only taking it once, rather than twice. Spinlocks aren't free. If we change the locking from struct_mutex to the spinlock, we should actually make it work, independent of what access we have today. The Correct? was just to check my understanding of your concern, I still think its invalid. You've cut away the second part of my mail where I explain why and I don't see you responding to that here. Also micro-optimizing the gpu reset code sounds a bit strange. -Daniel -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Wed, 4 Jan 2012 19:12:57 +0100, Daniel Vetter dan...@ffwll.ch wrote: The Correct? was just to check my understanding of your concern, I still think its invalid. You've cut away the second part of my mail where I explain why and I don't see you responding to that here. Also micro-optimizing the gpu reset code sounds a bit strange. Sorry, I didn't explain things very well. Right now, our register access looks like: get(struct_mutex); if (++forcewake_count == 1) force_wake_get() value = read32(reg) or write32(reg, val) if (--forcewake_count == 0) force_wake_put(); /* more register accesses may follow ... */ put(struct_mutex); All very sensible, the whole register sequence is covered by struct_mutex, which ensures that the forcewake is set across the register access. The patch does: get(spin_lock) if (++forcewake_count == 1) force_wake_get() put(spin_lock) value = read32(reg) or write32(reg, val) get(spin_lock) if (--forcewake_count == 0) force_wake_put() put(spin_lock) I realize that all current users hold the struct_mutex across this whole sequence, aside from the reset path, but we're removing that requirement explicitly (the patch removes the WARN_ON calls). Without a lock held across the whole sequence, it's easy to see how a race could occur. We're also dropping and re-acquiring a spinlock with a single instruction between, which seems wasteful. Instead, we should be doing: get(spin_lock) force_wake_get(); value = read32(reg) or write32(reg,val) force_wake_put(); put(spin_lock); No need here to deal with the wake lock at reset time; the whole operation is atomic wrt interrupts. It's more efficient *and* correct, without depending on the old struct_mutex locking. If you want to continue to expose the user-mode wake lock stuff, you could add: get(spin_lock); if (!forcewake_count) force_wake_get(); value = read32(reg) or write32(reg,val) if (!forcewake_count) force_wake_put(); put(spin_lock); This would require that the reset code also deal with the forcewake_count to restore the user-mode force wake. A further optimization would hold the force_wake active for 'a while' to avoid the extra bus cycles required, but we'd want to see a performance problem from this before doing that. -- keith.pack...@intel.com pgphx3pZKZy5W.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Wed, 14 Dec 2011 13:57:03 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote: The problem this patch solves is that the forcewake accounting necessary for register reads is protected by dev-struct_mutex. But the hangcheck and error_capture code need to access registers without grabbing this mutex because we hold it while waiting for the gpu. So a new lock is required. Because currently the error_state capture is called from the error irq handler and the hangcheck code runs from a timer, it needs to be an irqsafe spinlock (note that the registers used by the irq handler (neglecting the error handling part) only uses registers that don't need the forcewake dance). I think this description is wrong -- the only difference between using atomic objects and using a spinlock is that with the spinlock the call to -force_wake_get is correctly serialized so that no register access can occur without the chip being awoken. Without a spinlock, a second thread can pass right through gen6_gt_force_wake_get and then go touch registers while the first thread is busy waking the chip up. -- keith.pack...@intel.com pgpuWLYZdZV8S.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Tue, Jan 3, 2012 at 19:51, Keith Packard kei...@keithp.com wrote: On Wed, 14 Dec 2011 13:57:03 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote: The problem this patch solves is that the forcewake accounting necessary for register reads is protected by dev-struct_mutex. But the hangcheck and error_capture code need to access registers without grabbing this mutex because we hold it while waiting for the gpu. So a new lock is required. Because currently the error_state capture is called from the error irq handler and the hangcheck code runs from a timer, it needs to be an irqsafe spinlock (note that the registers used by the irq handler (neglecting the error handling part) only uses registers that don't need the forcewake dance). I think this description is wrong -- the only difference between using atomic objects and using a spinlock is that with the spinlock the call to -force_wake_get is correctly serialized so that no register access can occur without the chip being awoken. Without a spinlock, a second thread can pass right through gen6_gt_force_wake_get and then go touch registers while the first thread is busy waking the chip up. I'm a bit confused by this. With the current code forcewake is protected by dev-struct_mutex. Which doesn't work out because we need to access registers that require the forcewake dance from non-process context. Afaik the atomic ops stuff is just ducttape for paranoia reasons. -Daniel -- Daniel Vetter daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote: I'm a bit confused by this. With the current code forcewake is protected by dev-struct_mutex. Which doesn't work out because we need to access registers that require the forcewake dance from non-process context. Right, I like adding a spinlock around this to allow it to be called without needing to be able to lock the struct_mutex. (I remember suggesting that a spinlock would be necessary when the force wake code first showed up...) However, the commit message talks about the error capture and hang check code, but doesn't appear to change them at all. I think all this patch does is replace the locking for forcewake_count From struct_mutex to a new irq-safe spinlock, the commit message makes it sound like it's actually fixing stuff, which it isn't; it just enables fixing stuff in future patches, right? Reading through this a bit more, I think your patch opens up a hole in i915_reset. i915_reset takes struct_mutex, then resets the chip and restores the forcewake status. If we aren't relying on struct_mutex to protect the forcewake bits, then there's nothing preventing a thread From accessing the registers with the chip sleeping between the reset and the force wake reset. Afaik the atomic ops stuff is just ducttape for paranoia reasons. The atomic ops stuff would allow reading of the value without holding struct_mutex, if that were actually useful. -- keith.pack...@intel.com pgpXJ9jHvesmI.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On 01/03/2012 01:13 PM, Keith Packard wrote: On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote: I'm a bit confused by this. With the current code forcewake is protected by dev-struct_mutex. Which doesn't work out because we need to access registers that require the forcewake dance from non-process context. Right, I like adding a spinlock around this to allow it to be called without needing to be able to lock the struct_mutex. (I remember suggesting that a spinlock would be necessary when the force wake code first showed up...) However, the commit message talks about the error capture and hang check code, but doesn't appear to change them at all. I think all this patch does is replace the locking for forcewake_count From struct_mutex to a new irq-safe spinlock, the commit message makes it sound like it's actually fixing stuff, which it isn't; it just enables fixing stuff in future patches, right? As Daniel mentioned in the commit message, it fixes existing bugs simply by using a spinlock. In the timer, we do not grab struct_mutex and there is currently a race there (which we've known about since day 1). Afaik the atomic ops stuff is just ducttape for paranoia reasons. The atomic ops stuff would allow reading of the value without holding struct_mutex, if that were actually useful. The atomic ops stuff was simply there to help reduce the races (even if we don't have the lock, we can still safely increment the variable). It should be safe to get rid of with the spinlock in place. My only gripe here is Chris shot down my earlier version of this patch many moons ago :( Ben ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Tue, Jan 3, 2012 at 22:13, Keith Packard kei...@keithp.com wrote: On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote: I'm a bit confused by this. With the current code forcewake is protected by dev-struct_mutex. Which doesn't work out because we need to access registers that require the forcewake dance from non-process context. Right, I like adding a spinlock around this to allow it to be called without needing to be able to lock the struct_mutex. (I remember suggesting that a spinlock would be necessary when the force wake code first showed up...) However, the commit message talks about the error capture and hang check code, but doesn't appear to change them at all. I think all this patch does is replace the locking for forcewake_count From struct_mutex to a new irq-safe spinlock, the commit message makes it sound like it's actually fixing stuff, which it isn't; it just enables fixing stuff in future patches, right? Nope, current hangcheck blows up, and we have an i-g-t testcase for it (which the commit msg clearly states). There are also numerous bug reports where a dying gpu results in tons of WARN_ON(!mutex_locked(dev-struct_mutex)) noise in dmesg (which drowns out the gpu hang warning). The locking change fixes this. Reading through this a bit more, I think your patch opens up a hole in i915_reset. i915_reset takes struct_mutex, then resets the chip and restores the forcewake status. If we aren't relying on struct_mutex to protect the forcewake bits, then there's nothing preventing a thread From accessing the registers with the chip sleeping between the reset and the force wake reset. The patch adds the required locking to i915_reset. Afaik the atomic ops stuff is just ducttape for paranoia reasons. The atomic ops stuff would allow reading of the value without holding struct_mutex, if that were actually useful. ... but is currently unused and inherently racy. Which is why the patch drops it. -- Daniel Vetter daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Tue, 03 Jan 2012 13:49:36 -0800, Ben Widawsky b...@bwidawsk.net wrote: The atomic ops stuff was simply there to help reduce the races (even if we don't have the lock, we can still safely increment the variable). It should be safe to get rid of with the spinlock in place. My only gripe here is Chris shot down my earlier version of this patch many moons ago :( The other way of tackling it would be not to take the forcewake during hangcheck at all, and engineer the hangcheck not to rely on the ring reads. For example, use seqno as the primary activity monitor, which only leaves the case of trying not to fire spuriously during a long batchbuffer. To counter that, you could optimistically do a raw read of ACTHD or simply rely on long timeouts. Any error recover should be moved to the error handling workqueue, so that we never attempt to write a register or modify the stuct without the struct_mutex. Reducing the granularity of struct_mutex and solving the contention with mode_config.lock over register access is the ultimate goal when reviewing the locking mess. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
On Tue, 3 Jan 2012 22:49:52 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote: Nope, current hangcheck blows up, and we have an i-g-t testcase for it (which the commit msg clearly states). There are also numerous bug reports where a dying gpu results in tons of WARN_ON(!mutex_locked(dev-struct_mutex)) noise in dmesg (which drowns out the gpu hang warning). The locking change fixes this. Ah, ok, that makes sense. Of course, hangcheck *could* have just taken struct_mutex were it run in a suitable context. The patch adds the required locking to i915_reset. No, the spinlock protects the forcewake_count access and not the actual register access, which leaves all kinds of potential for races in threads not also holding struct_mutex while accessing registers. If you want a spinlock to protect the register access, it must surround the whole operation. -- keith.pack...@intel.com pgpgrBG3ePrUe.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock
The problem this patch solves is that the forcewake accounting necessary for register reads is protected by dev-struct_mutex. But the hangcheck and error_capture code need to access registers without grabbing this mutex because we hold it while waiting for the gpu. So a new lock is required. Because currently the error_state capture is called from the error irq handler and the hangcheck code runs from a timer, it needs to be an irqsafe spinlock (note that the registers used by the irq handler (neglecting the error handling part) only uses registers that don't need the forcewake dance). We could tune this down to a normal spinlock when we rework the error_state capture and hangcheck code to run from a workqueue. But we don't have any read in a fastpath that needs forcewake, so I've decided to not care much about overhead. This prevents tests/gem_hangcheck_forcewake from i-g-t from killing my snb on recent kernels - something must have slightly changed the timings. On previous kernels it only trigger a WARN about the broken locking. v2: Drop the previous patch for the register writes. v3: Improve the commit message per Chris Wilson's suggestions. Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com --- drivers/gpu/drm/i915/i915_debugfs.c |8 ++-- drivers/gpu/drm/i915/i915_dma.c |1 + drivers/gpu/drm/i915/i915_drv.c | 18 -- drivers/gpu/drm/i915/i915_drv.h | 10 +++--- 4 files changed, 26 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 60e8092..c130c5d 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1321,9 +1321,13 @@ static int i915_gen6_forcewake_count_info(struct seq_file *m, void *data) struct drm_info_node *node = (struct drm_info_node *) m-private; struct drm_device *dev = node-minor-dev; struct drm_i915_private *dev_priv = dev-dev_private; + unsigned forcewake_count; - seq_printf(m, forcewake count = %d\n, - atomic_read(dev_priv-forcewake_count)); + spin_lock_irq(dev_priv-gt_lock); + forcewake_count = dev_priv-forcewake_count; + spin_unlock_irq(dev_priv-gt_lock); + + seq_printf(m, forcewake count = %u\n, forcewake_count); return 0; } diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index a9533c5..448d5b1 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -2032,6 +2032,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) if (!IS_I945G(dev) !IS_I945GM(dev)) pci_enable_msi(dev-pdev); + spin_lock_init(dev_priv-gt_lock); spin_lock_init(dev_priv-irq_lock); spin_lock_init(dev_priv-error_lock); spin_lock_init(dev_priv-rps_lock); diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 28836fe..34f5115 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -368,11 +368,12 @@ void __gen6_gt_force_wake_mt_get(struct drm_i915_private *dev_priv) */ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv) { - WARN_ON(!mutex_is_locked(dev_priv-dev-struct_mutex)); + unsigned long irqflags; - /* Forcewake is atomic in case we get in here without the lock */ - if (atomic_add_return(1, dev_priv-forcewake_count) == 1) + spin_lock_irqsave(dev_priv-gt_lock, irqflags); + if (dev_priv-forcewake_count++ == 0) dev_priv-display.force_wake_get(dev_priv); + spin_unlock_irqrestore(dev_priv-gt_lock, irqflags); } void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv) @@ -392,10 +393,12 @@ void __gen6_gt_force_wake_mt_put(struct drm_i915_private *dev_priv) */ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv) { - WARN_ON(!mutex_is_locked(dev_priv-dev-struct_mutex)); + unsigned long irqflags; - if (atomic_dec_and_test(dev_priv-forcewake_count)) + spin_lock_irqsave(dev_priv-gt_lock, irqflags); + if (--dev_priv-forcewake_count == 0) dev_priv-display.force_wake_put(dev_priv); + spin_unlock_irqrestore(dev_priv-gt_lock, irqflags); } void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv) @@ -626,6 +629,7 @@ int i915_reset(struct drm_device *dev, u8 flags) * need to */ bool need_display = true; + unsigned long irqflags; int ret; if (!i915_try_reset) @@ -644,8 +648,10 @@ int i915_reset(struct drm_device *dev, u8 flags) case 6: ret = gen6_do_reset(dev, flags); /* If reset with a user forcewake, try to restore */ - if (atomic_read(dev_priv-forcewake_count)) + spin_lock_irqsave(dev_priv-gt_lock,