On Wednesday, 09.02.2005, at 22:12 +0100, Felix Kühling wrote:
> On Wednesday, 09.02.2005, at 20:58 +0100, Roland Scheidegger wrote:
[snip] 
> > Performance with gart texturing, even in 4x mode, takes a big hit 
> > (almost 50%).
> > I was not really able to get consistent performance results when both 
> > texture heaps were active, I guess it's luck of the day which textures 
> > got put in the gart heap and which ones in the local heap. But that 
> > performance indeed got faster with a smaller gart heap is not a good 
> > sign. And even if the maximum obtained in rtcw with 35MB local heap and 
> > 29MB gart heap was higher than the score obtained with 35MB local heap 
> > alone, there were clearly areas which ran faster with only the local heap.
> > It seems to me that the allocator really should try harder to use the 
> > local heap to be useful on r200 cards, moreover it is likely that you'd 
> > get quite a bit better performance when you DO have to put textures into 
> > the gart heap when you revisit that later when more space becomes 
> > available on the local heap and upload the still-used textures from the 
> > gart heap to the local heap (in fact, should be even faster than those 
> > 650MB/s, since no in-kernel-copy would be needed, it should be possible 
> > to blit it directly).
> 
> The big problem with the current texture allocator is that it can't tell
> which areas are really unused. Texture space is only allocated and never
> freed. Once the memory is "full" it starts kicking textures to upload
> new ones. This is the only way of "freeing" memory. Using an LRU
> strategy it has a good chance of kicking unused textures first, but
> there's no guarantee. It can't tell if a kicked texture will be needed
> the next instant. So trying to move textures from GART to local memory
> would basically mean that you blindly kick the least recently used
> texture(s) from local memory. If those textures are needed again soon
> then performance is going to suffer badly.
> 
> Therefore I'm proposing a modified allocator that fails when it needs to
> start kicking too recently used textures (e.g. textures used in the
> current or previous frame). Failure would not be fatal in this case, you
> just keep the texture in GART memory and try again later. Actually you
> could use the same allocator for normal texture uploads. Just specify
> the current texture heap age as the limit.
> 
> If you try to move textures back to local memory each time a texture is
> used, this would result in some kind of automatic regulation of heap
> usage. By kicking only textures that are several frames old in this
> process, you'd avoid thrashing.
> 
> Currently the texture heap age is only incremented on lock contention
> (IIRC). In this scheme you'd also increment it on buffer swaps and
> remember the texture heap ages of the last two buffer swaps.

I simplified this idea a little further and attached a patch against
texmem.[ch]. It frees stale textures (and also placeholders for other
clients' textures) that haven't been used for 1 second when it runs out of
space on a texture heap. This way it will try a bit harder to put
textures into the first heap before using the second heap, without much
risk (I hope) of performance regressions.
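
To illustrate the retry logic outside of the driver, here is a minimal
standalone sketch. Everything named toy_* is hypothetical and only models
the idea; it is not the texmem.c code. The heap is reduced to a byte
budget plus an LRU array, and the current time is passed in as a parameter
instead of being read with gettimeofday() so that the behaviour is
deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_TEXTURES 16

/* Hypothetical stand-in for a texture object (not driTextureObject). */
typedef struct {
    size_t size;       /* bytes occupied in the heap */
    double last_used;  /* time stamp in seconds */
    int    bound;      /* non-zero while the texture is bound */
} toy_texture;

/* Hypothetical stand-in for a texture heap (not driTexHeap). */
typedef struct {
    size_t       capacity;
    size_t       used;
    int          count;
    toy_texture *lru[TOY_MAX_TEXTURES];  /* lru[0] is most recently used */
} toy_heap;

/* Remove entry i from the LRU array and give its space back to the heap. */
static void toy_evict(toy_heap *heap, int i)
{
    heap->used -= heap->lru[i]->size;
    memmove(&heap->lru[i], &heap->lru[i + 1],
            (size_t)(heap->count - i - 1) * sizeof heap->lru[0]);
    heap->count--;
}

/* Walk from the least recently used end and evict unbound textures that
 * have been idle for more than max_age seconds. Because the array is LRU
 * sorted, the walk stops at the first unbound texture that is kept. */
static void toy_free_stale(toy_heap *heap, double now, double max_age)
{
    int i;
    assert(heap != NULL);
    for (i = heap->count - 1; i >= 0; i--) {
        toy_texture *t = heap->lru[i];
        if (t->bound)
            continue;
        if (now - t->last_used > max_age)
            toy_evict(heap, i);
        else
            break;
    }
}

/* Place the texture at the head of the LRU array if it fits. */
static int toy_try_place(toy_heap *heap, toy_texture *t, double now)
{
    if (heap->count == TOY_MAX_TEXTURES ||
        heap->capacity - heap->used < t->size)
        return 0;
    memmove(&heap->lru[1], &heap->lru[0],
            (size_t)heap->count * sizeof heap->lru[0]);
    heap->lru[0] = t;
    heap->count++;
    heap->used += t->size;
    t->last_used = now;
    return 1;
}

/* Try each heap in order. On failure, free textures idle for more than
 * 1 second and retry once before moving on to the next heap. Returns the
 * index of the heap used, or -1 if no heap had room. */
static int toy_alloc(toy_heap *heaps, int nr_heaps, toy_texture *t,
                     double now)
{
    int id;
    for (id = 0; id < nr_heaps; id++) {
        if (toy_try_place(&heaps[id], t, now))
            return id;
        toy_free_stale(&heaps[id], now, 1.0);
        if (toy_try_place(&heaps[id], t, now))
            return id;
    }
    return -1;
}
```

With two 100-byte heaps and 60-byte textures, a second texture requested
half a second after the first spills into the second heap, while one
requested two seconds later reclaims the first heap.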

I tested this on a ProSavageDDR where rendering speed appears to be the
same with local and GART textures. There was no measurable performance
regression in Quake3 and I noticed no subjective performance regression
in Torcs or Quake1 either.

Now the only thing missing in texmem.c for migrating textures from GART
to local memory would be a flag to driAllocateTexture to stop trying if
kicking stale textures didn't free up enough space (on the first texture
heap).
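
Such a flag could behave roughly as in the following standalone sketch.
The names demo_heap, demo_alloc and stale_only are made up for
illustration and do not exist in texmem.c; the real driAllocateTexture
has a different signature. With stale_only set, the allocator gives up as
soon as it would have to kick a texture used within the last max_age
seconds; without it, it keeps kicking in LRU order as before.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS 8

/* Hypothetical texture record: size in bytes and last-use time in seconds. */
typedef struct {
    size_t size;
    double last_used;
} demo_tex;

/* Hypothetical heap: slots[0] is most recently used,
 * slots[count - 1] is least recently used. */
typedef struct {
    size_t   capacity;
    size_t   used;
    int      count;
    demo_tex slots[DEMO_SLOTS];
} demo_heap;

/* Drop the least recently used entry and release its space. */
static void demo_kick_lru(demo_heap *h)
{
    h->count--;
    h->used -= h->slots[h->count].size;
}

/* Allocate `size` bytes at time `now`. Returns 1 on success, 0 on failure.
 * If stale_only is non-zero, only textures idle for more than max_age
 * seconds may be kicked; otherwise any texture may be kicked. */
static int demo_alloc(demo_heap *h, size_t size, double now,
                      double max_age, int stale_only)
{
    int i;
    assert(h != NULL);
    if (size > h->capacity)
        return 0;
    while (h->capacity - h->used < size || h->count == DEMO_SLOTS) {
        const demo_tex *lru = &h->slots[h->count - 1];
        if (stale_only && now - lru->last_used <= max_age)
            return 0;  /* would have to kick a recently used texture */
        demo_kick_lru(h);
    }
    /* Shift existing entries down and insert the new one at the head. */
    for (i = h->count; i > 0; i--)
        h->slots[i] = h->slots[i - 1];
    h->slots[0].size = size;
    h->slots[0].last_used = now;
    h->count++;
    h->used += size;
    return 1;
}
```

A caller migrating a texture from GART to local memory would pass
stale_only = 1 and simply leave the texture where it is on failure.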

Anyway, I think the attached patch should already make a difference as
it is. I'd be interested to hear how much it improves your performance
numbers with Quake3 and rtcw on r200 when both texture heaps are enabled.

> 
[snip]

Regards,
  Felix

-- 
| Felix Kühling <[EMAIL PROTECTED]>                     http://fxk.de.vu |
| PGP Fingerprint: 6A3C 9566 5B30 DDED 73C3  B152 151C 5CC1 D888 E595 |
--- ./texmem.h.~1.6.~	2005-02-02 17:20:40.000000000 +0100
+++ ./texmem.h	2005-02-10 17:44:40.000000000 +0100
@@ -101,6 +101,11 @@
 					 * value must be greater than
 					 * or equal to \c firstLevel.
 					 */
+
+	double      clockAge;		/**< Clock time stamp indicating when
+					 * the texture was last used. The unit
+					 * is seconds.
+					 */
 };
 
 
--- ./texmem.c.~1.10.~	2005-02-05 14:16:25.000000000 +0100
+++ ./texmem.c	2005-02-10 18:39:15.000000000 +0100
@@ -50,6 +50,7 @@
 #include "texformat.h"
 
 #include <assert.h>
+#include <sys/time.h>
 
 
 
@@ -243,6 +244,13 @@
        */
 
       move_to_head( & heap->texture_objects, t );
+      {
+	 struct timeval tv;
+	 if ( gettimeofday( &tv, NULL ) == 0 ) {
+	    t->clockAge = (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
+	 } else
+	    t->clockAge = 0.0;
+      }
 
 
       for (i = start ; i <= end ; i++) {
@@ -415,6 +423,15 @@
       t->heap = heap;
       if (in_use) 
 	 t->bound = 99;
+
+      {
+	 struct timeval tv;
+	 if ( gettimeofday( &tv, NULL ) == 0 ) {
+	    t->clockAge = (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
+	 } else
+	    t->clockAge = 0.0;
+      }
+
       insert_at_head( & heap->texture_objects, t );
    }
 }
@@ -477,6 +494,50 @@
 
 
 
+/**
+ * Free stale textures
+ *
+ * \param heap      The heap from which to kick stale textures
+ * \param seconds   Kick textures unused for this many seconds
+ */
+
+static void
+driFreeStaleTextures( driTexHeap * heap, double seconds )
+{
+   driTextureObject * temp;
+   driTextureObject * cursor;
+   struct timeval tv;
+   double curTime;
+   if ( gettimeofday( &tv, NULL ) != 0 )
+      return;
+   curTime = (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
+
+   if ( heap == NULL )
+      return;
+
+   for ( cursor = heap->texture_objects.prev, temp = cursor->prev;
+	 cursor != &heap->texture_objects ; 
+	 cursor = temp, temp = cursor->prev ) {
+
+      /* only consider our own textures that are not currently bound */
+      if ( cursor->bound || !cursor->tObj ) {
+	 continue;
+      }
+
+      if ( curTime - cursor->clockAge > seconds ) {
+	 driSwapOutTextureObject( cursor );
+      }
+      /* Since textures are LRU sorted, it should be safe to terminate
+       * this loop once the first texture is kept. */
+      else {
+	 break;
+      }
+   }
+}
+
+
+
+
 #define INDEX_ARRAY_SIZE 6 /* I'm not aware of driver with more than 2 heaps */
 
 /**
@@ -514,7 +575,7 @@
 
 
    /* Run through each of the existing heaps and try to allocate a buffer
-    * to hold the texture.
+    * to hold the texture. If this fails, free stale textures and try again.
     */
 
    for ( id = 0 ; (t->memBlock == NULL) && (id < nr_heaps) ; id++ ) {
@@ -522,6 +583,11 @@
       if ( heap != NULL ) {
 	 t->memBlock = mmAllocMem( heap->memory_heap, t->totalSize, 
 				   heap->alignmentShift, 0 );
+	 if ( t->memBlock == NULL ) {
+	    driFreeStaleTextures( heap, 1.0 );
+	    t->memBlock = mmAllocMem( heap->memory_heap, t->totalSize, 
+				      heap->alignmentShift, 0 );
+	 }
       }
    }
 
