[PATCH] drm/nouveau: fix locking in nouveau_crtc_page_flip

2013-07-02 Thread Dave Airlie
On Mon, Jul 1, 2013 at 8:00 PM, Maarten Lankhorst
maarten.lankho...@canonical.com wrote:
> This is a bit messed up because chan->cli->mutex is a different class,
> depending on whether it is the global drm client or not. This is
> because the global cli->mutex lock can be taken for eviction,
> so locking it before pinning the buffer objects may result in a deadlock.

conditional mutex locking made me feel a bit ill,

Ben, any cleaner way to do this? If not I'll merge this.

Dave.
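
For comparison, one untested sketch of an unconditional alternative (assuming
nouveau_bo_pin() itself never needs chan->cli->mutex; the fail_free label is
assumed from the surrounding function and is not shown in the hunk below):
pin the new buffer first, then take the channel mutex, so eviction can never
nest inside cli->mutex and the conditional goes away:

	/* hypothetical reordering sketch, not the posted patch */
	if (new_bo != old_bo) {
		ret = nouveau_bo_pin(new_bo, TTM_PL_FLAG_VRAM);
		if (ret)
			goto fail_free;	/* assumed error label */
	}

	mutex_lock(&chan->cli->mutex);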


[PATCH] drm/nouveau: fix locking in nouveau_crtc_page_flip

2013-07-01 Thread Maarten Lankhorst
This is a bit messed up because chan->cli->mutex is a different class,
depending on whether it is the global drm client or not. This is
because the global cli->mutex lock can be taken for eviction,
so locking it before pinning the buffer objects may result in a deadlock.

The locking order from outer to inner is:
- cli->mutex
- ttm_bo
- drm_client_lock (global cli->mutex)
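
Spelled out, the two paths in the splat below take these locks in opposite
order (simplified from the backtraces; illustration only, not driver code):

  A: nouveau_crtc_page_flip
	mutex_lock(&chan->cli->mutex)	/* drm_client_lock class when chan belongs to the global client */
	nouveau_bo_pin()		/* takes the ttm_bo reservation (reservation_ww_class_mutex), may evict */

  B: ttm_bo_vm_fault
	ttm_bo reservation		/* reservation_ww_class_mutex */
	nouveau_bo_move_m2mf()		/* takes the global cli->mutex (drm_client_lock) to do the move */

A holds cli->mutex and waits for the reservation, B holds the reservation and
waits for the global cli->mutex -- a textbook ABBA inversion.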

Fixes the following lockdep splat:

==
[ INFO: possible circular locking dependency detected ]
3.10.0-rc7+ #101 Not tainted
---
mplayer/4979 is trying to acquire lock:
 (drm_client_lock_class_key){+.+.+.}, at: []
nouveau_bo_move_m2mf.isra.13+0x4d/0x130 [nouveau]

but task is already holding lock:
 (reservation_ww_class_mutex){+.+.+.}, at: [] 
ttm_bo_vm_fault+0x47/0x394 [ttm]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (reservation_ww_class_mutex){+.+.+.}:
   [] lock_acquire+0x90/0x1f9
   [] mutex_lock_nested+0x56/0x3bb
   [] nouveau_bo_pin+0x3c/0x15b [nouveau]
   [] nouveau_crtc_page_flip+0xef/0x68b [nouveau]
   [] drm_mode_page_flip_ioctl+0x289/0x336 [drm]
   [] drm_ioctl+0x4d3/0x619 [drm]
   [] do_vfs_ioctl+0x90/0x50a
   [] SyS_ioctl+0x8f/0x9e
   [] tracesys+0xdd/0xe2

-> #0 (drm_client_lock_class_key){+.+.+.}:
   [] __lock_acquire+0x1c29/0x1c2b
   [] lock_acquire+0x90/0x1f9
   [] mutex_lock_nested+0x56/0x3bb
   [] nouveau_bo_move_m2mf.isra.13+0x4d/0x130 [nouveau]
   [] nouveau_bo_move+0xb9/0x3cb [nouveau]
   [] ttm_bo_handle_move_mem+0x24e/0x6b0 [ttm]
   [] ttm_bo_move_buffer+0x157/0x164 [ttm]
   [] ttm_bo_validate+0xa0/0x129 [ttm]
   [] nouveau_bo_validate+0x1c/0x1e [nouveau]
   [] nouveau_ttm_fault_reserve_notify+0xd0/0xd7 [nouveau]
   [] ttm_bo_vm_fault+0x69/0x394 [ttm]
   [] __do_fault+0x6e/0x496
   [] handle_pte_fault+0x84/0x861
   [] handle_mm_fault+0x1e2/0x2b1
   [] __do_page_fault+0x15e/0x517
   [] do_page_fault+0x37/0x6b
   [] page_fault+0x22/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(reservation_ww_class_mutex);
                               lock(drm_client_lock_class_key);
                               lock(reservation_ww_class_mutex);
  lock(drm_client_lock_class_key);

 *** DEADLOCK ***

2 locks held by mplayer/4979:
 #0:  (&mm->mmap_sem){++}, at: []
__do_page_fault+0xff/0x517
 #1:  (reservation_ww_class_mutex){+.+.+.}, at: [] 
ttm_bo_vm_fault+0x47/0x394 [ttm]

stack backtrace:
CPU: 3 PID: 4979 Comm: mplayer Not tainted 3.10.0-rc7+ #101
Hardware name: Acer Aspire M3985/Aspire M3985, BIOS P01-A1 03/12/2012
 8214cdf0 8803c8549678 816eb3d8 8803c85496c8
 816e5f4f 8803c85f86e0 8803c8549750 8803c85f8708
 8803c85f8000 8803c85f8708 8803c85f8730 8214cdf0
Call Trace:
 [] dump_stack+0x19/0x1b
 [] print_circular_bug+0x1fb/0x20c
 [] __lock_acquire+0x1c29/0x1c2b
 [] lock_acquire+0x90/0x1f9
 [] ? nouveau_bo_move_m2mf.isra.13+0x4d/0x130 [nouveau]
 [] ? mark_held_locks+0x6d/0x117
 [] mutex_lock_nested+0x56/0x3bb
 [] ? nouveau_bo_move_m2mf.isra.13+0x4d/0x130 [nouveau]
 [] ? trace_hardirqs_on+0xd/0xf
 [] nouveau_bo_move_m2mf.isra.13+0x4d/0x130 [nouveau]
 [] nouveau_bo_move+0xb9/0x3cb [nouveau]
 [] ttm_bo_handle_move_mem+0x24e/0x6b0 [ttm]
 [] ? ttm_bo_mem_space+0x18e/0x3e1 [ttm]
 [] ttm_bo_move_buffer+0x157/0x164 [ttm]
 [] ? __lock_is_held+0x4f/0x63
 [] ttm_bo_validate+0xa0/0x129 [ttm]
 [] nouveau_bo_validate+0x1c/0x1e [nouveau]
 [] nouveau_ttm_fault_reserve_notify+0xd0/0xd7 [nouveau]
 [] ttm_bo_vm_fault+0x69/0x394 [ttm]
 [] __do_fault+0x6e/0x496
 [] ? __mutex_unlock_slowpath+0xef/0x189
 [] handle_pte_fault+0x84/0x861
 [] ? __do_page_fault+0xff/0x517
 [] ? drm_ioctl+0x12f/0x619 [drm]
 [] handle_mm_fault+0x1e2/0x2b1
 [] __do_page_fault+0x15e/0x517
 [] ? avc_has_perm_flags+0x3c/0x204
 [] ? rcu_eqs_enter_common.isra.61+0x9c/0x1c6
 [] ? user_enter+0x75/0xaf
 [] ? rcu_eqs_exit+0x5c/0x9c
 [] do_page_fault+0x37/0x6b
 [] page_fault+0x22/0x30

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nouveau_display.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index 5ee23a4..d89d3ef 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -595,7 +595,8 @@ nouveau_crtc_page_flip(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 	chan = drm->channel;
 	spin_unlock(&old_bo->bo.bdev->fence_lock);
 
-	mutex_lock(&chan->cli->mutex);
+	if (chan->cli != &drm->client)
+		mutex_lock(&chan->cli->mutex);
 
 	if (new_bo != old_bo) {
 		ret = nouveau_bo_pin(new_bo, TTM_PL_FLAG_VRAM);
@@ -613,7 +614,8 @@ 
