Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On 2020-07-28 5:08 a.m., dan...@ffwll.ch wrote:

Mazin Rezk (27.07.20, 07:40): This patch fixes a race condition that causes a use-after-free during amdgpu_dm_atomic_commit_tail. This can occur when two non-blocking commits are requested and the second one finishes before the first. Essentially, this bug occurs when the following sequence of events happens:

1. Non-blocking commit #1 is requested with a new dm_state #1 and is deferred to the workqueue.

2. Non-blocking commit #2 is requested with a new dm_state #2 and is deferred to the workqueue.

3. Commit #2 starts before commit #1, dm_state #1 is used in the commit_tail, and commit #2 completes, freeing dm_state #1.

4. Commit #1 starts after commit #2 completes, uses the freed dm_state #1, and dereferences a freelist pointer while setting the context.

Christian König: Well, I only have a one-mile-high view on this, but why don't you let the work items execute in order? That would be better anyway, because this way we don't trigger cache-line ping-pong between CPUs.

Kazlauskas, Nicholas: We use the DRM helpers for managing drm_atomic_commit_state, and those helpers internally push non-blocking commit work into the system unbound work queue.

Christian König: Mhm, well if you send those helper atomic commits in the order A, B and they execute in the order B, A, I would call that a bug :)

Daniel Vetter: The way it works is that it pushes all commits into the unbound work queue, but then forces serialization as needed. We do _not_ want e.g. updates on different CRTCs to be serialized; that would result in lots of judder. And hw is funny enough that there are all kinds of dependencies.

The way you force synchronization is by adding other CRTC state objects. So if DC is busted and can only handle a single update per work item, then I guess you always need all CRTC states, and everything will be run in order. But that also totally kills modern multi-screen compositors. Xorg isn't modern, just in case that's not clear :-)

Looking at the code, it seems like you indeed have only a single dm state, so yeah, global sync is what you'll need as an immediate fix, and then maybe fix up DM to not be quite so silly ... or at least only do the dm state stuff when really needed.

We could also sprinkle the drm_crtc_commit structure around a bit (it's the glue that provides the synchronization across commits), but since your dm state is global, just grabbing all crtc states unconditionally as part of that is probably best.

Kazlauskas, Nicholas: While we could duplicate a copy of that code with nothing but the workqueue changed, that isn't something I'd really like to maintain going forward.

Christian König: I'm not talking about duplicating the code, I'm talking about fixing the helpers. I don't know that code well, but from the outside it sounds like a bug there. And executing work items in the order they are submitted is trivial. Has anybody pinged Daniel or other people familiar with the helper code about it?

Daniel Vetter: Yeah, something is wrong here, and the fix looks horrible :-)

Aside: I've also seen some recent discussion flare up about drm_atomic_state_get/put used to paper over some other use-after-free, but this time related to interrupt handlers. Maybe a few rules about that:
- don't
- especially not when it's interrupt handlers, because you can't call drm_atomic_state_put from interrupt handlers.

Instead, have a spin_lock_irq to protect the data shared with your interrupt handler, and _copy_ the data over. This is e.g. what drm_crtc_arm_vblank_event does.

Mazin Rezk: Nicholas wrote a patch that attempted to resolve the issue by adding every CRTC into the commit to use the stall checks. [1] While this forces synchronisation on commits, it's kind of a hacky method that may take a toll on performance.

Is it possible to have a DRM helper that forces synchronisation on some commits without having to add every CRTC into the commit?

Also, is synchronisation really necessary for fast updates in amdgpu? I'll admit, the idea of eliminating the use-after-free bug by eliminating the use entirely doesn't seem ideal; but is forcing synchronisation on these updates that much better?

Daniel Vetter: Well, clearing the dc_state pointer here and then allocating another one in atomic_commit_tail also looks fishy. The proper fix is probably a lot more involved, but yeah, the interim fix is to grab all crtc states iff you also grabbed the dm_atomic_state structure. The real fix is to only do this when necessary, which pretty much means the dc_state needs to be somehow split up, or there need to be some guarantees about when it's necessary and when not. Otherwise parallel commits on different CRTCs are not possible with the current dc backend code.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=207383#c96
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On 2020-07-27 5:32 p.m., Daniel Vetter wrote:
> [...]
>
> Well, clearing the dc_state pointer here and then allocating another one
> in atomic_commit_tail also looks fishy. The proper fix is probably a lot
> more involved, but yeah, the interim fix is to grab all crtc states iff
> you also grabbed the dm_atomic_state structure.
>
> Real fix is to only do this when necessary, which pretty much means the
> dc_state needs to be somehow split up, or there need to be some
> guarantees about when it's necessary and when not. Otherwise parallel
> commits on different CRTC are not possible with the current dc backend
> code.

Thanks for spending some time to help take a look at this as well. The
DRM documentation (at least at th
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On Monday, July 27, 2020 9:26 AM, Kazlauskas, Nicholas wrote:
> On 2020-07-27 1:40 a.m., Mazin Rezk wrote:
> > This patch fixes a race condition that causes a use-after-free during
> > amdgpu_dm_atomic_commit_tail. This can occur when two non-blocking
> > commits are requested and the second one finishes before the first.
> > Essentially, this bug occurs when the following sequence of events
> > happens:
> >
> > 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is
> > deferred to the workqueue.
> >
> > 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is
> > deferred to the workqueue.
> >
> > 3. Commit #2 starts before commit #1, dm_state #1 is used in the
> > commit_tail and commit #2 completes, freeing dm_state #1.
> >
> > 4. Commit #1 starts after commit #2 completes, uses the freed dm_state
> > #1 and dereferences a freelist pointer while setting the context.
> >
> > Since this bug has only been spotted with fast commits, this patch fixes
> > the bug by clearing the dm_state instead of using the old dc_state for
> > fast updates. In addition, since dm_state is only used for its dc_state
> > and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is
> > found, removing the dm_state should not have any consequences in fast
> > updates.
> >
> > This use-after-free bug has existed for a while now, but only caused a
> > noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate
> > freelist pointer to middle of object") moving the freelist pointer from
> > dm_state->base (which was unused) to dm_state->context (which is
> > dereferenced).
> >
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
> > Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates")
> > Reported-by: Duncan <1i5t5.dun...@cox.net>
> > Signed-off-by: Mazin Rezk
> > ---
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++-
> >  1 file changed, 27 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 86ffa0c2880f..710edc70e37e 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -8717,20 +8717,38 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
> >  		 * the same resource. If we have a new DC context as part of
> >  		 * the DM atomic state from validation we need to free it and
> >  		 * retain the existing one instead.
> > +		 *
> > +		 * Furthermore, since the DM atomic state only contains the DC
> > +		 * context and can safely be annulled, we can free the state
> > +		 * and clear the associated private object now to free
> > +		 * some memory and avoid a possible use-after-free later.
> >  		 */
> > -		struct dm_atomic_state *new_dm_state, *old_dm_state;
> >
> > -		new_dm_state = dm_atomic_get_new_state(state);
> > -		old_dm_state = dm_atomic_get_old_state(state);
> > +		for (i = 0; i < state->num_private_objs; i++) {
> > +			struct drm_private_obj *obj = state->private_objs[i].ptr;
> >
> > -		if (new_dm_state && old_dm_state) {
> > -			if (new_dm_state->context)
> > -				dc_release_state(new_dm_state->context);
> > +			if (obj->funcs == adev->dm.atomic_obj.funcs) {
> > +				int j = state->num_private_objs - 1;
> >
> > -			new_dm_state->context = old_dm_state->context;
> > +				dm_atomic_destroy_state(obj,
> > +						state->private_objs[i].state);
> >
> > -			if (old_dm_state->context)
> > -				dc_retain_state(old_dm_state->context);
> > +				/* If i is not at the end of the array then the
> > +				 * last element needs to be moved to where i was
> > +				 * before the array can safely be truncated.
> > +				 */
> > +				if (i != j)
> > +					state->private_objs[i] =
> > +						state->private_objs[j];
> >
> > +				state->private_objs[j].ptr = NULL;
> > +				state->private_objs[j].state = NULL;
> > +				state->private_objs[j].old_state = NULL;
> > +				state->private_objs[j].new_state = NULL;
> >
> > +				state->num_private_objs = j;
> > +				break;
> > +			}
>
> In the bug report itself I mentioned that I don't really like hacking
> around the DRM core for resolving this patch, but to go into more
> specifics, it's really two issues of code maintenance
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On Mon, Jul 27, 2020 at 11:11 PM Mazin Rezk wrote: > > On Monday, July 27, 2020 4:29 PM, Daniel Vetter wrote: > > > On Mon, Jul 27, 2020 at 9:28 PM Christian König > > wrote: > > > > > > Am 27.07.20 um 16:05 schrieb Kazlauskas, Nicholas: > > > > On 2020-07-27 9:39 a.m., Christian König wrote: > > > >> Am 27.07.20 um 07:40 schrieb Mazin Rezk: > > > >>> This patch fixes a race condition that causes a use-after-free during > > > >>> amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking > > > >>> commits > > > >>> are requested and the second one finishes before the first. > > > >>> Essentially, > > > >>> this bug occurs when the following sequence of events happens: > > > >>> > > > >>> 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is > > > >>> deferred to the workqueue. > > > >>> > > > >>> 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is > > > >>> deferred to the workqueue. > > > >>> > > > >>> 3. Commit #2 starts before commit #1, dm_state #1 is used in the > > > >>> commit_tail and commit #2 completes, freeing dm_state #1. > > > >>> > > > >>> 4. Commit #1 starts after commit #2 completes, uses the freed dm_state > > > >>> 1 and dereferences a freelist pointer while setting the context. > > > >> > > > >> Well I only have a one mile high view on this, but why don't you let > > > >> the work items execute in order? > > > >> > > > >> That would be better anyway cause this way we don't trigger a cache > > > >> line ping pong between CPUs. > > > >> > > > >> Christian. > > > > > > > > We use the DRM helpers for managing drm_atomic_commit_state and those > > > > helpers internally push non-blocking commit work into the system > > > > unbound work queue. > > > > > > Mhm, well if you send those helper atomic commits in the order A,B and > > > they execute it in the order B,A I would call that a bug :) > > > > The way it works is it pushes all commits into unbound work queue, but > > then forces serialization as needed. 
We do _not_ want e.g. updates on > > different CRTC to be serialized, that would result in lots of judder. > > And hw is funny enough that there's all kinds of dependencies. > > > > The way you force synchronization is by adding other CRTC state > > objects. So if DC is busted and can only handle a single update per > > work item, then I guess you always need all CRTC states and everything > > will be run in order. But that also totally kills modern multi-screen > > compositors. Xorg isn't modern, just in case that's not clear :-) > > > > Lucking at the code it seems like you indeed have only a single dm > > state, so yeah global sync is what you'll need as immediate fix, and > > then maybe fix up DM to not be quite so silly ... or at least only do > > the dm state stuff when really needed. > > > > We could also sprinkle the drm_crtc_commit structure around a bit > > (it's the glue that provides the synchronization across commits), but > > since your dm state is global just grabbing all crtc states > > unconditionally as part of that is probably best. > > > > > > While we could duplicate a copy of that code with nothing but the > > > > workqueue changed that isn't something I'd really like to maintain > > > > going forward. > > > > > > I'm not talking about duplicating the code, I'm talking about fixing the > > > helpers. I don't know that code well, but from the outside it sounds > > > like a bug there. > > > > > > And executing work items in the order they are submitted is trivial. > > > > > > Had anybody pinged Daniel or other people familiar with the helper code > > > about it? > > > > Yeah something is wrong here, and the fix looks horrible :-) > > > > Aside, I've also seen some recent discussion flare up about > > drm_atomic_state_get/put used to paper over some other use-after-free, > > but this time related to interrupt handlers. 
Maybe a few rules about > > that: > > - don't > > - especially not when it's interrupt handlers, because you can't call > > drm_atomic_state_put from interrupt handlers. > > > > Instead have a spin_lock_irq to protect the shared data with your > > interrupt handler, and _copy_ the data over. This is e.g. what > > drm_crtc_arm_vblank_event does. > > Nicholas wrote a patch that attempted to resolve the issue by adding every > CRTC into the commit to use the stall checks. [1] While this forces > synchronisation on commits, it's kind of a hacky method that may take a > toll on performance. > > Is it possible to have a DRM helper that forces synchronisation on some > commits without having to add every CRTC into the commit? > > Also, is synchronisation really necessary for fast updates in amdgpu? > I'll admit, the idea of eliminating the use-after-free bug by eliminating > the use entirely doesn't seem ideal; but is forcing synchronisation on > these updates that much better? Well clearing the dc_state pointer here and then allocating another one in atomic_commit_tail also looks fishy. The proper fix is probably a lot more involved.
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On Mon, Jul 27, 2020 at 10:29 PM Daniel Vetter wrote: > > On Mon, Jul 27, 2020 at 9:28 PM Christian König > wrote: > > > > Am 27.07.20 um 16:05 schrieb Kazlauskas, Nicholas: > > > On 2020-07-27 9:39 a.m., Christian König wrote: > > >> Am 27.07.20 um 07:40 schrieb Mazin Rezk: > > >>> This patch fixes a race condition that causes a use-after-free during > > >>> amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking > > >>> commits > > >>> are requested and the second one finishes before the first. > > >>> Essentially, > > >>> this bug occurs when the following sequence of events happens: > > >>> > > >>> 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is > > >>> deferred to the workqueue. > > >>> > > >>> 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is > > >>> deferred to the workqueue. > > >>> > > >>> 3. Commit #2 starts before commit #1, dm_state #1 is used in the > > >>> commit_tail and commit #2 completes, freeing dm_state #1. > > >>> > > >>> 4. Commit #1 starts after commit #2 completes, uses the freed dm_state > > >>> 1 and dereferences a freelist pointer while setting the context. > > >> > > >> Well I only have a one mile high view on this, but why don't you let > > >> the work items execute in order? > > >> > > >> That would be better anyway cause this way we don't trigger a cache > > >> line ping pong between CPUs. > > >> > > >> Christian. > > > > > > We use the DRM helpers for managing drm_atomic_commit_state and those > > > helpers internally push non-blocking commit work into the system > > > unbound work queue. > > > > Mhm, well if you send those helper atomic commits in the order A,B and > > they execute it in the order B,A I would call that a bug :) > > The way it works is it pushes all commits into unbound work queue, but > then forces serialization as needed. We do _not_ want e.g. updates on > different CRTC to be serialized, that would result in lots of judder. 
> And hw is funny enough that there's all kinds of dependencies. > > The way you force synchronization is by adding other CRTC state > objects. So if DC is busted and can only handle a single update per > work item, then I guess you always need all CRTC states and everything > will be run in order. But that also totally kills modern multi-screen > compositors. Xorg isn't modern, just in case that's not clear :-) > > Lucking at the code it seems like you indeed have only a single dm > state, so yeah global sync is what you'll need as immediate fix, and > then maybe fix up DM to not be quite so silly ... or at least only do > the dm state stuff when really needed. Just looked a bit more at this struct dc_state, and that looks a lot like an atomic side-wagon. I don't think that works as a private state, this should probably be embedded into a subclass of drm_atomic_state. And probably a lot of these pointers moved to other places I think, or I'm not entirely clear on what exactly this stuff is needed for ... dc_state is also refcounted, which is definitely rather funny for a state structure. Feels like this entire thing (how the overall dc state machinery is glued into atomic) isn't quite thought thru just yet :-/ -Daniel > We could also sprinkle the drm_crtc_commit structure around a bit > (it's the glue that provides the synchronization across commits), but > since your dm state is global just grabbing all crtc states > unconditionally as part of that is probably best. > > > > While we could duplicate a copy of that code with nothing but the > > > workqueue changed that isn't something I'd really like to maintain > > > going forward. > > > > I'm not talking about duplicating the code, I'm talking about fixing the > > helpers. I don't know that code well, but from the outside it sounds > > like a bug there. > > > > And executing work items in the order they are submitted is trivial. 
> > > > Had anybody pinged Daniel or other people familiar with the helper code > > about it? > > Yeah something is wrong here, and the fix looks horrible :-) > > Aside, I've also seen some recent discussion flare up about > drm_atomic_state_get/put used to paper over some other use-after-free, > but this time related to interrupt handlers. Maybe a few rules about > that: > - dont > - especially not when it's interrupt handlers, because you can't call > drm_atomic_state_put from interrupt handlers. > > Instead have an spin_lock_irq to protect the shared date with your > interrupt handler, and _copy_ the date over. This is e.g. what > drm_crtc_arm_vblank_event does. > > Cheers, Daniel > > > > > Regards, > > Christian. > > > > > > > > Regards, > > > Nicholas Kazlauskas > > > > > >> > > >>> > > >>> Since this bug has only been spotted with fast commits, this patch > > >>> fixes > > >>> the bug by clearing the dm_state instead of using the old dc_state for > > >>> fast updates. In addition, since dm_state is only used for its dc_state > > >>> and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is >
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On Mon, Jul 27, 2020 at 9:28 PM Christian König wrote: > > Am 27.07.20 um 16:05 schrieb Kazlauskas, Nicholas: > > On 2020-07-27 9:39 a.m., Christian König wrote: > >> Am 27.07.20 um 07:40 schrieb Mazin Rezk: > >>> This patch fixes a race condition that causes a use-after-free during > >>> amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking > >>> commits > >>> are requested and the second one finishes before the first. > >>> Essentially, > >>> this bug occurs when the following sequence of events happens: > >>> > >>> 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is > >>> deferred to the workqueue. > >>> > >>> 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is > >>> deferred to the workqueue. > >>> > >>> 3. Commit #2 starts before commit #1, dm_state #1 is used in the > >>> commit_tail and commit #2 completes, freeing dm_state #1. > >>> > >>> 4. Commit #1 starts after commit #2 completes, uses the freed dm_state > >>> 1 and dereferences a freelist pointer while setting the context. > >> > >> Well I only have a one mile high view on this, but why don't you let > >> the work items execute in order? > >> > >> That would be better anyway cause this way we don't trigger a cache > >> line ping pong between CPUs. > >> > >> Christian. > > > > We use the DRM helpers for managing drm_atomic_commit_state and those > > helpers internally push non-blocking commit work into the system > > unbound work queue. > > Mhm, well if you send those helper atomic commits in the order A,B and > they execute it in the order B,A I would call that a bug :) The way it works is it pushes all commits into unbound work queue, but then forces serialization as needed. We do _not_ want e.g. updates on different CRTC to be serialized, that would result in lots of judder. And hw is funny enough that there's all kinds of dependencies. The way you force synchronization is by adding other CRTC state objects. 
So if DC is busted and can only handle a single update per work item, then I guess you always need all CRTC states and everything will be run in order. But that also totally kills modern multi-screen compositors. Xorg isn't modern, just in case that's not clear :-) Looking at the code it seems like you indeed have only a single dm state, so yeah global sync is what you'll need as immediate fix, and then maybe fix up DM to not be quite so silly ... or at least only do the dm state stuff when really needed. We could also sprinkle the drm_crtc_commit structure around a bit (it's the glue that provides the synchronization across commits), but since your dm state is global just grabbing all crtc states unconditionally as part of that is probably best. > > While we could duplicate a copy of that code with nothing but the > > workqueue changed that isn't something I'd really like to maintain > > going forward. > > I'm not talking about duplicating the code, I'm talking about fixing the > helpers. I don't know that code well, but from the outside it sounds > like a bug there. > > And executing work items in the order they are submitted is trivial. > > Had anybody pinged Daniel or other people familiar with the helper code > about it? Yeah something is wrong here, and the fix looks horrible :-) Aside, I've also seen some recent discussion flare up about drm_atomic_state_get/put used to paper over some other use-after-free, but this time related to interrupt handlers. Maybe a few rules about that: - don't - especially not when it's interrupt handlers, because you can't call drm_atomic_state_put from interrupt handlers. Instead have a spin_lock_irq to protect the shared data with your interrupt handler, and _copy_ the data over. This is e.g. what drm_crtc_arm_vblank_event does. Cheers, Daniel > > Regards, > Christian. 
> > > > > Regards, > > Nicholas Kazlauskas > > > >> > >>> > >>> Since this bug has only been spotted with fast commits, this patch > >>> fixes > >>> the bug by clearing the dm_state instead of using the old dc_state for > >>> fast updates. In addition, since dm_state is only used for its dc_state > >>> and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is > >>> found, > >>> removing the dm_state should not have any consequences in fast updates. > >>> > >>> This use-after-free bug has existed for a while now, but only caused a > >>> noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: > >>> relocate > >>> freelist pointer to middle of object") moving the freelist pointer from > >>> dm_state->base (which was unused) to dm_state->context (which is > >>> dereferenced). > >>> > >>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383 > >>> Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state > >>> for fast updates") > >>> Reported-by: Duncan <1i5t5.dun...@cox.net> > >>> Signed-off-by: Mazin Rezk > >>> --- > >>> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 > >>> ++- > >>> 1 file changed, 27 insertions(+), 9 deletions
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
Am 27.07.20 um 16:05 schrieb Kazlauskas, Nicholas: On 2020-07-27 9:39 a.m., Christian König wrote: Am 27.07.20 um 07:40 schrieb Mazin Rezk: This patch fixes a race condition that causes a use-after-free during amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits are requested and the second one finishes before the first. Essentially, this bug occurs when the following sequence of events happens: 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is deferred to the workqueue. 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is deferred to the workqueue. 3. Commit #2 starts before commit #1, dm_state #1 is used in the commit_tail and commit #2 completes, freeing dm_state #1. 4. Commit #1 starts after commit #2 completes, uses the freed dm_state 1 and dereferences a freelist pointer while setting the context. Well I only have a one mile high view on this, but why don't you let the work items execute in order? That would be better anyway cause this way we don't trigger a cache line ping pong between CPUs. Christian. We use the DRM helpers for managing drm_atomic_commit_state and those helpers internally push non-blocking commit work into the system unbound work queue. Mhm, well if you send those helper atomic commits in the order A,B and they execute it in the order B,A I would call that a bug :) While we could duplicate a copy of that code with nothing but the workqueue changed that isn't something I'd really like to maintain going forward. I'm not talking about duplicating the code, I'm talking about fixing the helpers. I don't know that code well, but from the outside it sounds like a bug there. And executing work items in the order they are submitted is trivial. Had anybody pinged Daniel or other people familiar with the helper code about it? Regards, Christian. 
Regards,
Nicholas Kazlauskas

Since this bug has only been spotted with fast commits, this patch fixes the bug by clearing the dm_state instead of using the old dc_state for fast updates. In addition, since dm_state is only used for its dc_state and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found, removing the dm_state should not have any consequences in fast updates.

This use-after-free bug has existed for a while now, but only caused a noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate freelist pointer to middle of object") moving the freelist pointer from dm_state->base (which was unused) to dm_state->context (which is dereferenced).

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates")
Reported-by: Duncan <1i5t5.dun...@cox.net>
Signed-off-by: Mazin Rezk
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++-
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 86ffa0c2880f..710edc70e37e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8717,20 +8717,38 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
 	 * the same resource. If we have a new DC context as part of
 	 * the DM atomic state from validation we need to free it and
 	 * retain the existing one instead.
+	 *
+	 * Furthermore, since the DM atomic state only contains the DC
+	 * context and can safely be annulled, we can free the state
+	 * and clear the associated private object now to free
+	 * some memory and avoid a possible use-after-free later.
 	 */
-	struct dm_atomic_state *new_dm_state, *old_dm_state;

-	new_dm_state = dm_atomic_get_new_state(state);
-	old_dm_state = dm_atomic_get_old_state(state);
+	for (i = 0; i < state->num_private_objs; i++) {
+		struct drm_private_obj *obj = state->private_objs[i].ptr;

-	if (new_dm_state && old_dm_state) {
-		if (new_dm_state->context)
-			dc_release_state(new_dm_state->context);
+		if (obj->funcs == adev->dm.atomic_obj.funcs) {
+			int j = state->num_private_objs-1;

-		new_dm_state->context = old_dm_state->context;
+			dm_atomic_destroy_state(obj,
+					state->private_objs[i].state);
+
+			/* If i is not at the end of the array then the
+			 * last element needs to be moved to where i was
+			 * before the array can safely be truncated.
+			 */
+			if (i != j)
+				state->private_objs[i] =
+					state->private_objs[j];

-		if (old_dm_state->context)
-			dc_retain_state(old_dm_state->context);
+			state->private_objs[j].ptr = NULL;
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On Mon, 27 Jul 2020 10:05:01 -0400 "Kazlauskas, Nicholas" wrote: > On 2020-07-27 9:39 a.m., Christian König wrote: > > Am 27.07.20 um 07:40 schrieb Mazin Rezk: > >> This patch fixes a race condition that causes a use-after-free > >> during amdgpu_dm_atomic_commit_tail. This can occur when 2 > >> non-blocking commits are requested and the second one finishes > >> before the first. Essentially, this bug occurs when the following > >> sequence of events happens: > >> > >> 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is > >> deferred to the workqueue. > >> > >> 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is > >> deferred to the workqueue. > >> > >> 3. Commit #2 starts before commit #1, dm_state #1 is used in the > >> commit_tail and commit #2 completes, freeing dm_state #1. > >> > >> 4. Commit #1 starts after commit #2 completes, uses the freed > >> dm_state 1 and dereferences a freelist pointer while setting the > >> context. > > > > Well I only have a one mile high view on this, but why don't you > > let the work items execute in order? > > > > That would be better anyway cause this way we don't trigger a cache > > line ping pong between CPUs. > > > > Christian. > > We use the DRM helpers for managing drm_atomic_commit_state and those > helpers internally push non-blocking commit work into the system > unbound work queue. > > While we could duplicate a copy of that code with nothing but the > workqueue changed that isn't something I'd really like to maintain > going forward. > > Regards, > Nicholas Kazlauskas Additionally, I don't see mentioned on-thread (it's on the bug and now in the details below) that we're talking multi-monitor, not single-monitor. Presumably that goes some way toward answering the "why not force order?" question, considering the outputs may be running at different refresh frequencies, etc... 
All the reports on the bug seem to be multi-monitor (after seeing multi-monitor/output in several reports I asked if anyone was seeing it with only one monitor and no answers), and as you commented on the bug for your proposed patch but seems missing from this one here (different author/proposal) ... Commits #1 and #2 don't touch any of the same core DRM objects (CRTCs, Planes, Connectors) so Commit #2 does not stall for Commit #1. DRM Private Objects have always been avoided in stall checks, so we have no safety from DRM core in this regard. > >> > >> Since this bug has only been spotted with fast commits, this patch > >> fixes the bug by clearing the dm_state instead of using the old > >> dc_state for fast updates. In addition, since dm_state is only > >> used for its dc_state and amdgpu_dm_atomic_commit_tail will retain > >> the dc_state if none is found, > >> removing the dm_state should not have any consequences in fast > >> updates. > >> > >> This use-after-free bug has existed for a while now, but only > >> caused a noticeable issue starting from 5.7-rc1 due to 3202fa62f > >> ("slub: relocate freelist pointer to middle of object") moving the > >> freelist pointer from dm_state->base (which was unused) to > >> dm_state->context (which is dereferenced). 
> >> > >> Bugzilla: > >> https://bugzilla.kernel.org/show_bug.cgi?id=207383 > >> > >> Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state > >> for fast updates") > >> Reported-by: Duncan <1i5t5.dun...@cox.net> > >> Signed-off-by: Mazin Rezk > >> --- > >> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 > >> ++- 1 file changed, 27 insertions(+), 9 > >> deletions(-) > >> > >> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >> index 86ffa0c2880f..710edc70e37e 100644 > >> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >> @@ -8717,20 +8717,38 @@ static int amdgpu_dm_atomic_check(struct > >> drm_device *dev, > >> * the same resource. If we have a new DC context as > >> part of > >> * the DM atomic state from validation we need to free > >> it and > >> * retain the existing one instead. > >> + * > >> + * Furthermore, since the DM atomic state only contains > >> the DC > >> + * context and can safely be annulled, we can free the > >> state > >> + * and clear the associated private object now to free > >> + * some memory and avoid a possible use-after-free later. > >> */ > >> - struct dm_atomic_state *new_dm_state, *old_dm_state; > >> > >> - new_dm_state = dm_atomic_get_new_state(state); > >> - old_dm_state = dm_atomic_get_old_state(state); > >> + for (i = 0; i < state->num_private_objs; i++) { > >> + struct drm_private_obj *obj = > >> state->private_objs[i].ptr; > >> > >> - if (new_dm_state && old_dm_state) { > >> - if (new_dm_state->context) > >> - dc_release_state
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On 2020-07-27 9:39 a.m., Christian König wrote: Am 27.07.20 um 07:40 schrieb Mazin Rezk: This patch fixes a race condition that causes a use-after-free during amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits are requested and the second one finishes before the first. Essentially, this bug occurs when the following sequence of events happens: 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is deferred to the workqueue. 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is deferred to the workqueue. 3. Commit #2 starts before commit #1, dm_state #1 is used in the commit_tail and commit #2 completes, freeing dm_state #1. 4. Commit #1 starts after commit #2 completes, uses the freed dm_state 1 and dereferences a freelist pointer while setting the context. Well I only have a one mile high view on this, but why don't you let the work items execute in order? That would be better anyway cause this way we don't trigger a cache line ping pong between CPUs. Christian. We use the DRM helpers for managing drm_atomic_commit_state and those helpers internally push non-blocking commit work into the system unbound work queue. While we could duplicate a copy of that code with nothing but the workqueue changed that isn't something I'd really like to maintain going forward. Regards, Nicholas Kazlauskas Since this bug has only been spotted with fast commits, this patch fixes the bug by clearing the dm_state instead of using the old dc_state for fast updates. In addition, since dm_state is only used for its dc_state and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found, removing the dm_state should not have any consequences in fast updates. 
This use-after-free bug has existed for a while now, but only caused a noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate freelist pointer to middle of object") moving the freelist pointer from dm_state->base (which was unused) to dm_state->context (which is dereferenced).

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates")
Reported-by: Duncan <1i5t5.dun...@cox.net>
Signed-off-by: Mazin Rezk
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++-
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 86ffa0c2880f..710edc70e37e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8717,20 +8717,38 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
 	 * the same resource. If we have a new DC context as part of
 	 * the DM atomic state from validation we need to free it and
 	 * retain the existing one instead.
+	 *
+	 * Furthermore, since the DM atomic state only contains the DC
+	 * context and can safely be annulled, we can free the state
+	 * and clear the associated private object now to free
+	 * some memory and avoid a possible use-after-free later.
 	 */
-	struct dm_atomic_state *new_dm_state, *old_dm_state;

-	new_dm_state = dm_atomic_get_new_state(state);
-	old_dm_state = dm_atomic_get_old_state(state);
+	for (i = 0; i < state->num_private_objs; i++) {
+		struct drm_private_obj *obj = state->private_objs[i].ptr;

-	if (new_dm_state && old_dm_state) {
-		if (new_dm_state->context)
-			dc_release_state(new_dm_state->context);
+		if (obj->funcs == adev->dm.atomic_obj.funcs) {
+			int j = state->num_private_objs-1;

-		new_dm_state->context = old_dm_state->context;
+			dm_atomic_destroy_state(obj,
+					state->private_objs[i].state);
+
+			/* If i is not at the end of the array then the
+			 * last element needs to be moved to where i was
+			 * before the array can safely be truncated.
+			 */
+			if (i != j)
+				state->private_objs[i] =
+					state->private_objs[j];

-		if (old_dm_state->context)
-			dc_retain_state(old_dm_state->context);
+			state->private_objs[j].ptr = NULL;
+			state->private_objs[j].state = NULL;
+			state->private_objs[j].old_state = NULL;
+			state->private_objs[j].new_state = NULL;
+
+			state->num_private_objs = j;
+			break;
+		}
 	}
 }
--
2.27.0
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
Am 27.07.20 um 07:40 schrieb Mazin Rezk: This patch fixes a race condition that causes a use-after-free during amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits are requested and the second one finishes before the first. Essentially, this bug occurs when the following sequence of events happens: 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is deferred to the workqueue. 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is deferred to the workqueue. 3. Commit #2 starts before commit #1, dm_state #1 is used in the commit_tail and commit #2 completes, freeing dm_state #1. 4. Commit #1 starts after commit #2 completes, uses the freed dm_state 1 and dereferences a freelist pointer while setting the context. Well I only have a one mile high view on this, but why don't you let the work items execute in order? That would be better anyway cause this way we don't trigger a cache line ping pong between CPUs. Christian. Since this bug has only been spotted with fast commits, this patch fixes the bug by clearing the dm_state instead of using the old dc_state for fast updates. In addition, since dm_state is only used for its dc_state and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found, removing the dm_state should not have any consequences in fast updates. This use-after-free bug has existed for a while now, but only caused a noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate freelist pointer to middle of object") moving the freelist pointer from dm_state->base (which was unused) to dm_state->context (which is dereferenced). 
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates")
Reported-by: Duncan <1i5t5.dun...@cox.net>
Signed-off-by: Mazin Rezk
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++-
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 86ffa0c2880f..710edc70e37e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8717,20 +8717,38 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
 	 * the same resource. If we have a new DC context as part of
 	 * the DM atomic state from validation we need to free it and
 	 * retain the existing one instead.
+	 *
+	 * Furthermore, since the DM atomic state only contains the DC
+	 * context and can safely be annulled, we can free the state
+	 * and clear the associated private object now to free
+	 * some memory and avoid a possible use-after-free later.
 	 */
-	struct dm_atomic_state *new_dm_state, *old_dm_state;

-	new_dm_state = dm_atomic_get_new_state(state);
-	old_dm_state = dm_atomic_get_old_state(state);
+	for (i = 0; i < state->num_private_objs; i++) {
+		struct drm_private_obj *obj = state->private_objs[i].ptr;

-	if (new_dm_state && old_dm_state) {
-		if (new_dm_state->context)
-			dc_release_state(new_dm_state->context);
+		if (obj->funcs == adev->dm.atomic_obj.funcs) {
+			int j = state->num_private_objs-1;

-		new_dm_state->context = old_dm_state->context;
+			dm_atomic_destroy_state(obj,
+					state->private_objs[i].state);
+
+			/* If i is not at the end of the array then the
+			 * last element needs to be moved to where i was
+			 * before the array can safely be truncated.
+			 */
+			if (i != j)
+				state->private_objs[i] =
+					state->private_objs[j];

-		if (old_dm_state->context)
-			dc_retain_state(old_dm_state->context);
+			state->private_objs[j].ptr = NULL;
+			state->private_objs[j].state = NULL;
+			state->private_objs[j].old_state = NULL;
+			state->private_objs[j].new_state = NULL;
+
+			state->num_private_objs = j;
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On 2020-07-27 1:40 a.m., Mazin Rezk wrote:

This patch fixes a race condition that causes a use-after-free during amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits are requested and the second one finishes before the first. Essentially, this bug occurs when the following sequence of events happens:

1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is deferred to the workqueue.

2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is deferred to the workqueue.

3. Commit #2 starts before commit #1, dm_state #1 is used in the commit_tail and commit #2 completes, freeing dm_state #1.

4. Commit #1 starts after commit #2 completes, uses the freed dm_state 1 and dereferences a freelist pointer while setting the context.

Since this bug has only been spotted with fast commits, this patch fixes the bug by clearing the dm_state instead of using the old dc_state for fast updates. In addition, since dm_state is only used for its dc_state and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found, removing the dm_state should not have any consequences in fast updates.

This use-after-free bug has existed for a while now, but only caused a noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate freelist pointer to middle of object") moving the freelist pointer from dm_state->base (which was unused) to dm_state->context (which is dereferenced).
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates")
Reported-by: Duncan <1i5t5.dun...@cox.net>
Signed-off-by: Mazin Rezk
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++-
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 86ffa0c2880f..710edc70e37e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8717,20 +8717,38 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
	 * the same resource. If we have a new DC context as part of
	 * the DM atomic state from validation we need to free it and
	 * retain the existing one instead.
+	 *
+	 * Furthermore, since the DM atomic state only contains the DC
+	 * context and can safely be annulled, we can free the state
+	 * and clear the associated private object now to free
+	 * some memory and avoid a possible use-after-free later.
	 */
-	struct dm_atomic_state *new_dm_state, *old_dm_state;
-	new_dm_state = dm_atomic_get_new_state(state);
-	old_dm_state = dm_atomic_get_old_state(state);
+	for (i = 0; i < state->num_private_objs; i++) {
+		struct drm_private_obj *obj = state->private_objs[i].ptr;
-	if (new_dm_state && old_dm_state) {
-		if (new_dm_state->context)
-			dc_release_state(new_dm_state->context);
+		if (obj->funcs == adev->dm.atomic_obj.funcs) {
+			int j = state->num_private_objs - 1;
-		new_dm_state->context = old_dm_state->context;
+			dm_atomic_destroy_state(obj,
+					state->private_objs[i].state);
+
+			/* If i is not at the end of the array then the
+			 * last element needs to be moved to where i was
+			 * before the array can safely be truncated.
+			 */
+			if (i != j)
+				state->private_objs[i] =
+					state->private_objs[j];
-		if (old_dm_state->context)
-			dc_retain_state(old_dm_state->context);
+			state->private_objs[j].ptr = NULL;
+			state->private_objs[j].state = NULL;
+			state->private_objs[j].old_state = NULL;
+			state->private_objs[j].new_state = NULL;
+
+			state->num_private_objs = j;
+			break;
+		}

In the bug report itself I mentioned that I don't really like hacking around the DRM core for resolving this patch, but to go into more specifics, it's really two issues of code maintenance:

1. It's iterating over internal structures and layout of private objects in the state and modifying the state. The core doesn't really guarantee how these things are going to be laid out and it may change in the future.

2. It's freeing an allocation we don
Re: [PATCH] drm/amd/display: Clear dm_state for fast updates
On Monday, July 27, 2020 1:40 AM, Mazin Rezk wrote:

> This patch fixes a race condition that causes a use-after-free during
> amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits
> are requested and the second one finishes before the first. Essentially,
> this bug occurs when the following sequence of events happens:
>
> 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is
> deferred to the workqueue.
>
> 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is
> deferred to the workqueue.
>
> 3. Commit #2 starts before commit #1, dm_state #1 is used in the
> commit_tail and commit #2 completes, freeing dm_state #1.
>
> 4. Commit #1 starts after commit #2 completes, uses the freed dm_state
> 1 and dereferences a freelist pointer while setting the context.
>
> Since this bug has only been spotted with fast commits, this patch fixes
> the bug by clearing the dm_state instead of using the old dc_state for
> fast updates. In addition, since dm_state is only used for its dc_state
> and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found,
> removing the dm_state should not have any consequences in fast updates.
>
> This use-after-free bug has existed for a while now, but only caused a
> noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate
> freelist pointer to middle of object") moving the freelist pointer from
> dm_state->base (which was unused) to dm_state->context (which is
> dereferenced).
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
> Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates")
> Reported-by: Duncan <1i5t5.dun...@cox.net>
> Signed-off-by: Mazin Rezk
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++-
>  1 file changed, 27 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 86ffa0c2880f..710edc70e37e 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -8717,20 +8717,38 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
> 	 * the same resource. If we have a new DC context as part of
> 	 * the DM atomic state from validation we need to free it and
> 	 * retain the existing one instead.
> +	 *
> +	 * Furthermore, since the DM atomic state only contains the DC
> +	 * context and can safely be annulled, we can free the state
> +	 * and clear the associated private object now to free
> +	 * some memory and avoid a possible use-after-free later.
> 	 */
> -	struct dm_atomic_state *new_dm_state, *old_dm_state;
>
> -	new_dm_state = dm_atomic_get_new_state(state);
> -	old_dm_state = dm_atomic_get_old_state(state);
> +	for (i = 0; i < state->num_private_objs; i++) {
> +		struct drm_private_obj *obj = state->private_objs[i].ptr;
>
> -	if (new_dm_state && old_dm_state) {
> -		if (new_dm_state->context)
> -			dc_release_state(new_dm_state->context);
> +		if (obj->funcs == adev->dm.atomic_obj.funcs) {
> +			int j = state->num_private_objs - 1;
>
> -		new_dm_state->context = old_dm_state->context;
> +			dm_atomic_destroy_state(obj,
> +					state->private_objs[i].state);
> +
> +			/* If i is not at the end of the array then the
> +			 * last element needs to be moved to where i was
> +			 * before the array can safely be truncated.
> +			 */
> +			if (i != j)
> +				state->private_objs[i] =
> +					state->private_objs[j];
>
> -		if (old_dm_state->context)
> -			dc_retain_state(old_dm_state->context);
> +			state->private_objs[j].ptr = NULL;
> +			state->private_objs[j].state = NULL;
> +			state->private_objs[j].old_state = NULL;
> +			state->private_objs[j].new_state = NULL;
> +
> +			state->num_private_objs = j;
> +			break;
> +		}
> 	}
> }
>
> --
> 2.27.0
>

I have tested this on 5.8.0-rc6 w/ an RX 480 for 8 hours and I have not had the crash described in the Bugzilla thread. I will also be running this patch on my kernel for the next couple of days to further confirm that this is working. In addition, I will ask the users in the Bugzilla thread to test this patch