Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom

Thomas Hellstrom wrote:


Yes, it looks correct. It seems a little unintuitive, though, to enter
the loop with the spinlock held and exit it with the spinlock not held.
I've attached yet another patch that fixes that. Could you take a
look at whether it seems OK to you and, in that case, repost it for Dave?

... And the patch.

/Thomas

From 68972c220abe3ce19eb046d72fa218646168adc7 Mon Sep 17 00:00:00 2001
From: Luca Barbieri l...@luca-barbieri.com
Date: Mon, 18 Jan 2010 19:34:53 +0100
Subject: [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete (v2)

ttm_bo_delayed_delete has a race condition, because after we do:
kref_put(&nentry->list_kref, ttm_bo_release_list);

we are not holding the list lock and not holding any reference to
objects, and thus every bo in the list can be removed and freed at
this point.

However, we then use the next pointer we stored, which is not guaranteed
to be valid.

This was apparently the cause of some Nouveau oopses I experienced.

This patch rewrites the function so that it keeps the reference to nentry
until nentry itself is freed, after we have already taken a reference to
nentry->next.

It should now be correct and free of races, but please double check this.

Updated according to Thomas Hellstrom's feedback.

Signed-off-by: Luca Barbieri l...@luca-barbieri.com
---
 drivers/gpu/drm/ttm/ttm_bo.c |   58 ++++++++++++++++++++++++++--------------------------
 1 files changed, 26 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 2920f9a..5dfa41f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -523,52 +523,46 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, bool remove_all)
 static int ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
 {
 	struct ttm_bo_global *glob = bdev->glob;
-	struct ttm_buffer_object *entry, *nentry;
-	struct list_head *list, *next;
-	int ret;
+	struct ttm_buffer_object *entry;
+	int ret = 0;
 
 	spin_lock(&glob->lru_lock);
-	list_for_each_safe(list, next, &bdev->ddestroy) {
-		entry = list_entry(list, struct ttm_buffer_object, ddestroy);
-		nentry = NULL;
+	if (list_empty(&bdev->ddestroy)) {
+		spin_unlock(&glob->lru_lock);
+		return 0;
+	}
 
-		/*
-		 * Protect the next list entry from destruction while we
-		 * unlock the lru_lock.
-		 */
+	entry = list_first_entry(&bdev->ddestroy,
+		struct ttm_buffer_object, ddestroy);
+	kref_get(&entry->list_kref);
+
+	for (;;) {
+		struct ttm_buffer_object *nentry = NULL;
 
-		if (next != &bdev->ddestroy) {
-			nentry = list_entry(next, struct ttm_buffer_object,
-					    ddestroy);
+		if (entry->ddestroy.next != &bdev->ddestroy) {
+			nentry = list_first_entry(&entry->ddestroy,
+				struct ttm_buffer_object, ddestroy);
 			kref_get(&nentry->list_kref);
 		}
-		kref_get(&entry->list_kref);
 
 		spin_unlock(&glob->lru_lock);
 		ret = ttm_bo_cleanup_refs(entry, remove_all);
 		kref_put(&entry->list_kref, ttm_bo_release_list);
+		entry = nentry;
+
+		if (ret || !entry)
+			break;
 
 		spin_lock(&glob->lru_lock);
-		if (nentry) {
-			bool next_onlist = !list_empty(next);
+
+		if (list_empty(&entry->ddestroy)) {
 			spin_unlock(&glob->lru_lock);
-			kref_put(&nentry->list_kref, ttm_bo_release_list);
-			spin_lock(&glob->lru_lock);
-			/*
-			 * Someone might have raced us and removed the
-			 * next entry from the list. We don't bother restarting
-			 * list traversal.
-			 */
-
-			if (!next_onlist)
-				break;
-		}
-		if (ret)
 			break;
+		}
 	}
-	ret = !list_empty(&bdev->ddestroy);
-	spin_unlock(&glob->lru_lock);
 
+	if (entry)
+		kref_put(&entry->list_kref, ttm_bo_release_list);
 	return ret;
 }
 
-- 
1.6.3.3



Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom

Thomas Hellstrom wrote:

Thomas Hellstrom wrote:

Yes, it looks correct. It seems a little unintuitive, though, to enter
the loop with the spinlock held and exit it with the spinlock not held.
I've attached yet another patch that fixes that. Could you take a
look at whether it seems OK to you and, in that case, repost it for Dave?


... And the patch.

/Thomas

  
Hmm, it seems I should've stayed in bed today. Hopefully this is the
right one...


/Thomas


From b3dc72cf74d1b8e0eb5f5423fbb0ac975f147f4c Mon Sep 17 00:00:00 2001
From: Luca Barbieri l...@luca-barbieri.com
Date: Mon, 18 Jan 2010 19:34:53 +0100
Subject: [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete (v2)

ttm_bo_delayed_delete has a race condition, because after we do:
kref_put(&nentry->list_kref, ttm_bo_release_list);

we are not holding the list lock and not holding any reference to
objects, and thus every bo in the list can be removed and freed at
this point.

However, we then use the next pointer we stored, which is not guaranteed
to be valid.

This was apparently the cause of some Nouveau oopses I experienced.

This patch rewrites the function so that it keeps the reference to nentry
until nentry itself is freed, after we have already taken a reference to
nentry->next.

It should now be correct and free of races, but please double check this.

Updated according to Thomas Hellstrom's feedback.

Signed-off-by: Luca Barbieri l...@luca-barbieri.com
Signed-off-by: Thomas Hellstrom thellst...@vmware.com
---
 drivers/gpu/drm/ttm/ttm_bo.c |   58 +++++++++++++++++++++++++---------------------------
 1 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index c7733c3..1a3e909 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -524,52 +524,44 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, bool remove_all)
 static int ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
 {
 	struct ttm_bo_global *glob = bdev->glob;
-	struct ttm_buffer_object *entry, *nentry;
-	struct list_head *list, *next;
-	int ret;
+	struct ttm_buffer_object *entry = NULL;
+	int ret = 0;
 
 	spin_lock(&glob->lru_lock);
-	list_for_each_safe(list, next, &bdev->ddestroy) {
-		entry = list_entry(list, struct ttm_buffer_object, ddestroy);
-		nentry = NULL;
+	if (list_empty(&bdev->ddestroy))
+		goto out_unlock;
 
-		/*
-		 * Protect the next list entry from destruction while we
-		 * unlock the lru_lock.
-		 */
+	entry = list_first_entry(&bdev->ddestroy,
+		struct ttm_buffer_object, ddestroy);
+	kref_get(&entry->list_kref);
+
+	for (;;) {
+		struct ttm_buffer_object *nentry = NULL;
 
-		if (next != &bdev->ddestroy) {
-			nentry = list_entry(next, struct ttm_buffer_object,
-					    ddestroy);
+		if (entry->ddestroy.next != &bdev->ddestroy) {
+			nentry = list_first_entry(&entry->ddestroy,
+				struct ttm_buffer_object, ddestroy);
 			kref_get(&nentry->list_kref);
 		}
-		kref_get(&entry->list_kref);
 
 		spin_unlock(&glob->lru_lock);
 		ret = ttm_bo_cleanup_refs(entry, remove_all);
 		kref_put(&entry->list_kref, ttm_bo_release_list);
+		entry = nentry;
+
+		if (ret || !entry)
+			goto out;
 
 		spin_lock(&glob->lru_lock);
-		if (nentry) {
-			bool next_onlist = !list_empty(next);
-			spin_unlock(&glob->lru_lock);
-			kref_put(&nentry->list_kref, ttm_bo_release_list);
-			spin_lock(&glob->lru_lock);
-			/*
-			 * Someone might have raced us and removed the
-			 * next entry from the list. We don't bother restarting
-			 * list traversal.
-			 */
-
-			if (!next_onlist)
-				break;
-		}
-		if (ret)
+		if (list_empty(&entry->ddestroy))
 			break;
 	}
-	ret = !list_empty(&bdev->ddestroy);
-	spin_unlock(&glob->lru_lock);
 
+out_unlock:
+	spin_unlock(&glob->lru_lock);
+out:
+	if (entry)
+		kref_put(&entry->list_kref, ttm_bo_release_list);
 	return ret;
 }
 
-- 
1.6.2.5



Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Luca Barbieri
Yes, it's fine. I sent your patch to Dave with an expanded commit
message for merging.

Here is a possible redesign of the mechanism, inspired by this issue.
It seems that what we are racing against is buffer eviction, since
delayed-deletion buffers are still kept on the LRU list.

I'm wondering if the delayed deletion design could be changed as follows:
1. Remove to-be-deleted buffers from the LRU list before putting them
on the delayed-delete list.
2. Change buffer eviction to first do a delayed-deletion pass. This
should be cheap (and cheaper than the current code) because delayed
deletion stops at the first unsignaled fence.
3. Add a new delayed-deletion lock/semaphore. Then, have
ttm_bo_delayed_delete take it for reading and keep it across the
function.
4. Inside the delayed-deletion lock, grab the LRU lock, copy the
delayed-delete list head to a local variable, set it to empty and
release the lock.
5. Iterate over the privately held list with list_for_each_safe.
6. At the end of ttm_bo_delayed_delete, retake the LRU lock and re-add
the remaining part of our private list at the head of the global list.

This would prevent uselessly trying to evict delayed-delete buffers
(which should be processed in fence order, not LRU order), and would
also prevent concurrent delayed deletions, which are probably more
harmful than useful. Steps 4-6 are sketched below.
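
A minimal sketch of steps 4-6 (untested; it assumes a new ddestroy_mutex
and that ttm_bo_cleanup_refs is changed to leave list removal to the
caller, so take it as an illustration rather than an implementation):

static int ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
{
	struct ttm_bo_global *glob = bdev->glob;
	struct ttm_buffer_object *entry, *nentry;
	LIST_HEAD(private);	/* the part of ddestroy we now own */
	int ret = 0;

	mutex_lock(&bdev->ddestroy_mutex);	/* hypothetical new lock */

	/* Step 4: steal the whole list under the LRU lock. */
	spin_lock(&glob->lru_lock);
	list_splice_init(&bdev->ddestroy, &private);
	spin_unlock(&glob->lru_lock);

	/* Step 5: nobody else can touch the private list. */
	list_for_each_entry_safe(entry, nentry, &private, ddestroy) {
		ret = ttm_bo_cleanup_refs(entry, remove_all);
		if (ret)
			break;		/* first unsignaled fence */
		list_del_init(&entry->ddestroy);
		kref_put(&entry->list_kref, ttm_bo_release_list);
	}

	/* Step 6: put the unprocessed tail back at the head. */
	spin_lock(&glob->lru_lock);
	list_splice(&private, &bdev->ddestroy);
	spin_unlock(&glob->lru_lock);

	mutex_unlock(&bdev->ddestroy_mutex);
	return ret;
}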

Furthermore, it should be possible to get rid of list locking in the
following way:
1. BOs to be delayed-deleted are added to the head of the initial
delayed-deletion singly linked list, using an atomic cmpxchg.
2. ttm_bo_delayed_delete takes the delayed-deletion lock and grabs the
list with an atomic xchg of the head with zero.
3. It reverses the list in place, processes the entries and puts them
at the end of a second singly linked list, the recurring delayed-deletion
list.
4. It processes the recurring delayed-deletion list, cleaning up the BOs.
5. Finally, the delayed-deletion lock is released.

This makes adding to the delayed-deletion list lockless, as sketched
below.
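
As a sketch, steps 1-3 could look like this (the ddestroy_next field and
the helpers are invented for illustration; cmpxchg()/xchg() are the
usual kernel primitives):

/* Step 1: lockless push onto the shared head (callable from anywhere). */
static void ddestroy_push(struct ttm_buffer_object **head,
			  struct ttm_buffer_object *bo)
{
	struct ttm_buffer_object *old;

	do {
		old = *head;
		bo->ddestroy_next = old;
	} while (cmpxchg(head, old, bo) != old);
}

/* Step 2: grab the whole list atomically, leaving it empty. */
static struct ttm_buffer_object *ddestroy_grab(struct ttm_buffer_object **head)
{
	return xchg(head, NULL);
}

/* Step 3: the grabbed list is newest-first; reverse it in place so the
 * entries are processed in deletion-time order. */
static struct ttm_buffer_object *ddestroy_reverse(struct ttm_buffer_object *bo)
{
	struct ttm_buffer_object *prev = NULL;

	while (bo) {
		struct ttm_buffer_object *next = bo->ddestroy_next;

		bo->ddestroy_next = prev;
		prev = bo;
		bo = next;
	}
	return prev;
}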

The LRU list, by contrast, inherently needs to be doubly linked, so only
RCU could make it lockless, and it seems that may require an externally
allocated list node structure (so readers don't suddenly jump to the
most recent node), which would not be a win (except with *lots* of
CPUs).
Besides, most DRM drivers (except vmware) take the BKL around all
ioctls and (except nouveau) use a single pushbuffer, so this is a bit
moot anyway.

What do you think?

Anyway, this, if done, would be for the next merge window, or later,
while the current fix ought to be merged now.


Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Luca Barbieri
 Also note that the delayed delete list is not in fence order but in
 deletion-time order, which perhaps gives room for more optimizations.
You are right.
I think then that ttm_bo_delayed_delete may still need to be changed,
because it stops when ttm_bo_cleanup_refs returns -EBUSY, which
happens when a fence has not been reached.
This means that a buffer will need to wait for all previously deleted
buffers to become unused, even if it is unused itself.
Is this acceptable?

What if we get rid of the delayed destroy list, and instead append
buffers to be deleted to their fence object, and delete them when the
fence is signaled?

This also allows doing it more naturally, since the fence object can
just keep a normal reference to the buffers it fences, and unreference
them on expiration.

Then there needs to be no special delayed destruction logic, and it
would work as if the GPU were keeping a reference to the buffer
itself, using fences as a proxy to have the CPU do that work for the
GPU.

Then the delayed work is no longer "periodically destroy buffers" but
rather "periodically check if fences have expired", naturally stopping
at the first unexpired one.
Drivers that support IRQs on fences could also do the work in the
interrupt handler/tasklet instead, avoiding the delay-jiffies magic
number. This may need a NAPI-like interrupt-mitigation middle layer
for optimal results, though.
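
Roughly like this (the release_list field and both helpers are invented
to illustrate the idea; real TTM fence objects look different):

/* Hypothetical: each fence owns the buffers whose destruction it gates. */
struct ttm_fence_object {
	struct list_head release_list;	/* bos kept alive until signal */
	/* ... driver fence state ... */
};

/* Instead of a global delayed-destroy list, queue the bo on its fence. */
static void ttm_bo_release_on_fence(struct ttm_fence_object *fence,
				    struct ttm_buffer_object *bo)
{
	kref_get(&bo->list_kref);	/* the fence now holds a reference */
	list_add_tail(&bo->ddestroy, &fence->release_list);
}

/* Called once the fence has signaled (from process context). */
static void ttm_fence_release_bos(struct ttm_fence_object *fence)
{
	struct ttm_buffer_object *bo, *tmp;

	list_for_each_entry_safe(bo, tmp, &fence->release_list, ddestroy) {
		list_del_init(&bo->ddestroy);
		kref_put(&bo->list_kref, ttm_bo_release_list);
	}
}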

 But isn't an atomic cmpxchg about as costly as a spinlock?
I think it's cheaper on all architectures; otherwise it would be mostly
pointless to have it, since you could emulate it with a spinlock.


Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Luca Barbieri
When designing this, we should also keep in mind that some drivers
(e.g. nouveau) have multiple FIFO channels, and thus we would like a
buffer to be referenced for reading by multiple channels at once (and
be destroyed only when all fences have expired, obviously).
Also, hardware may support on-GPU inter-channel synchronization, and
then multiple references may be for writing too.

If we use an external, dynamically allocated channel/buffer list node,
we can support this (and if the kernel allocators aren't fast enough,
which they should be, we can keep released nodes linked to the bo to
speed up allocation).

Note that in nouveau there is a small hardware limit on the number of
channels (up to 128 on nv50), but future hardware may support unlimited
channels.


Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom
Luca Barbieri wrote:
 Also note that the delayed delete list is not in fence order but in
 deletion-time order, which perhaps gives room for more optimizations.
 
 You are right.
 I think then that ttm_bo_delayed_delete may still need to be changed,
 because it stops when ttm_bo_cleanup_refs returns -EBUSY, which
 happens when a fence has not been reached.
 This means that a buffer will need to wait for all previously deleted
 buffers to become unused, even if it is unused itself.
 Is this acceptable?
   

Yes, I think it's acceptable if you view it in the context that the most
important buffer resources (GPU memory space and physical system memory)
are immediately reclaimable through the eviction and swapping mechanisms.

 What if we get rid of the delayed destroy list, and instead append
 buffers to be deleted to their fence object, and delete them when the
 fence is signaled?

 This also allows doing it more naturally, since the fence object can
 just keep a normal reference to the buffers it fences, and unreference
 them on expiration.

 Then there needs to be no special delayed destruction logic, and it
 would work as if the GPU were keeping a reference to the buffer
 itself, using fences as a proxy to have the CPU do that work for the
 GPU.

 Then the delayed work is no longer "periodically destroy buffers" but
 rather "periodically check if fences have expired", naturally stopping
 at the first unexpired one.
 Drivers that support IRQs on fences could also do the work in the
 interrupt handler/tasklet instead, avoiding the delay-jiffies magic
 number. This may need a NAPI-like interrupt-mitigation middle layer
 for optimal results, though.

   

Yes, I think that this way, it should definitely be possible to find a
more optimal solution. One should keep in mind, however, that we'll
probably not be able to destroy buffers from within an atomic context,
which means we have to schedule a workqueue to do that task. We had to
do a similar thing in the Poulsbo driver, and it turned out that we could
save a significant amount of CPU by using a delayed workqueue,
collecting objects and destroying them periodically.
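
The pattern would look something like this (a sketch, not the actual
Poulsbo code; all names are invented):

static LIST_HEAD(doomed_list);		/* objects collected for destruction */
static DEFINE_SPINLOCK(doomed_lock);

/* Runs in process context; destroys everything collected so far. */
static void doomed_work_fn(struct work_struct *work)
{
	LIST_HEAD(batch);
	struct ttm_buffer_object *bo, *tmp;

	spin_lock(&doomed_lock);
	list_splice_init(&doomed_list, &batch);
	spin_unlock(&doomed_lock);

	list_for_each_entry_safe(bo, tmp, &batch, ddestroy) {
		list_del_init(&bo->ddestroy);
		kref_put(&bo->list_kref, ttm_bo_release_list);
	}
}

static DECLARE_DELAYED_WORK(doomed_work, doomed_work_fn);

/* Safe from atomic context: just queue the object and kick the work. */
static void defer_destroy(struct ttm_buffer_object *bo)
{
	spin_lock(&doomed_lock);
	list_add_tail(&bo->ddestroy, &doomed_list);
	spin_unlock(&doomed_lock);
	schedule_delayed_work(&doomed_work, HZ / 10);
}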

/Thomas



Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom
Luca Barbieri wrote:
 When designing this, we should also keep in mind that some drivers
 (e.g. nouveau) have multiple FIFO channels, and thus we would like a
 buffer to be referenced for reading by multiple channels at once (and
 be destroyed only when all fences have expired, obviously).
 Also, hardware may support on-GPU inter-channel synchronization, and
 then multiple references may be for writing too.
   

In the context of the current code, I've been thinking of having a list
of fences on each bo to support multiple readers, and also to try to
deal with the problem of simultaneous GPU and CPU readers.

But if the hardware supports on-GPU inter-channel synchronization, I 
figure the code should be smart enough to only wait on the last write 
fence?
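
Something like this, perhaps (field names invented; fence_signaled()
stands in for whatever the driver actually exposes):

/* Hypothetical per-bo state: many read fences, one write fence. */
struct ttm_bo_sync {
	struct list_head read_fences;		/* one entry per reading channel */
	struct ttm_fence_object *last_write;
};

/* With on-GPU inter-channel synchronization, a new GPU consumer only
 * needs to order against the last write; the read fences would matter
 * only for CPU access and final destruction. */
static bool ttm_bo_gpu_ready(struct ttm_bo_sync *sync)
{
	return !sync->last_write || fence_signaled(sync->last_write);
}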

/Thomas



[Nouveau] [mesa PATCH] nv50: fix crash in nv50_pre_pipebuffer_map (nv50_screen->cur_ctx)

2010-01-20 Thread Marcin Slusarz
nv50_pre_pipebuffer_map references screen->cur_ctx, which points
to freed memory after the context is destroyed.
This crash is easily triggerable by progs/xdemos/glxcontexts.
---
 src/gallium/drivers/nv50/nv50_context.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/src/gallium/drivers/nv50/nv50_context.c b/src/gallium/drivers/nv50/nv50_context.c
index d598f0e..952741c 100644
--- a/src/gallium/drivers/nv50/nv50_context.c
+++ b/src/gallium/drivers/nv50/nv50_context.c
@@ -82,6 +82,10 @@ nv50_destroy(struct pipe_context *pipe)
    so_ref(NULL, &nv50->state.vtxattr);
 
    draw_destroy(nv50->draw);
+
+   if (nv50->screen->cur_ctx == nv50)
+      nv50->screen->cur_ctx = NULL;
+
    FREE(nv50);
 }
 
-- 
1.6.6
