On Thursday, 10.02.2005, 17:40 -0500, Jon Smirl wrote:
> On Thu, 10 Feb 2005 23:13:30 +0100, Felix Kühling <[EMAIL PROTECTED]> wrote:
> > This scheme would give good results with movie players that need fast
> > texture uploads and typically use each texture exactly once. It would
> 
> Movie players aren't even close to being texture bandwidth bound. The

That's not my experience. Optimizations in the texture upload path,
use of the AGP heap, and partial texture uploads had a big impact on
mplayer -vo gl performance on my ProSavageDDR (a factor of 2-3 with all
of them taken together).

> demote from local to AGP scheme would cause two copies on each frame
> but there is plenty of bandwidth. But this assumes that the movie
> player creates a new texture for each frame.
> 
> A better scheme for a movie player would be to create a single texture
> and then keep replacing its contents.

You're right, that's what actually happens in mplayer. It uses
glTexSubImage2D because it typically changes only a part of a texture
with power-of-two dimensions.
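
To illustrate the partial update, here is a minimal sketch in Python (a
toy calculation, not mplayer or driver code). The helper names
next_pow2 and video_texture_layout are made up for this example. It
assumes the frame is placed at the origin of a texture whose sides are
rounded up to the next power of two, so that only the frame-sized
rectangle needs to be passed to glTexSubImage2D.

```python
def next_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def video_texture_layout(frame_w, frame_h):
    """Hypothetical helper: where a video frame lands in a power-of-two texture."""
    tex_w, tex_h = next_pow2(frame_w), next_pow2(frame_h)
    return {
        # Size of the texture that is allocated once.
        "tex_size": (tex_w, tex_h),
        # x, y, width, height of the region replaced on every frame.
        "update_rect": (0, 0, frame_w, frame_h),
        # Texture coordinates of the frame's far corner when drawing.
        "max_texcoord": (frame_w / tex_w, frame_h / tex_h),
    }


layout = video_texture_layout(720, 576)
```

For a 720x576 frame this gives a 1024x1024 texture, of which only the
720x576 region is rewritten per frame.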

> Or use two textures and double
> buffer. But once created these textures would not move in the LRU list
> unless you started something like a game in another window.

Yes, they would move in the LRU list. That's why it's called "least
recently used" not "least recently created". ;-)

So I would have to modify my scheme to reset the usage count/frequency
when a texture image is changed, such that a texture that is updated
very frequently would not be promoted to local memory.
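
A minimal sketch of that reset, as a toy Python model rather than
driver code. The Texture class, the use_count field and the
PROMOTE_THRESHOLD value are all assumptions made for the example.

```python
PROMOTE_THRESHOLD = 4  # assumed value, would need tuning


class Texture:
    """Toy stand-in for a driver texture object (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.use_count = 0  # binds since the last image change

    def bind(self):
        # Called whenever the texture is bound to a texture unit.
        self.use_count += 1

    def update_image(self):
        # New image data invalidates the usage history.
        self.use_count = 0


def wants_promotion(tex):
    return tex.use_count >= PROMOTE_THRESHOLD


# A movie frame texture: rewritten before every draw.
video = Texture("video")
for _ in range(100):
    video.update_image()
    video.bind()

# A game texture: uploaded once, then bound every frame.
wall = Texture("wall")
wall.update_image()
for _ in range(100):
    wall.bind()
```

The video texture never accumulates more than one use between updates,
so it stays where it is, while the game texture quickly qualifies.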

On Thursday, 10.02.2005, 17:34 -0500, Jon Smirl wrote:
> On Thu, 10 Feb 2005 23:13:30 +0100, Felix Kühling <[EMAIL PROTECTED]> wrote:
> > This means you copy a texture when you don't know if or when you're
> > going to need it again. So the move of the texture may just be a waste
> > of time. It would be better to just kick the texture and upload it again
> > later when it's really needed.
> 
> I suspect this extra texture copy wouldn't be noticeable except when
> you construct a test program which artificially triggers it. Most
> games will achieve a steady state with their loaded textures after a
> frame or two and the copies will stop.

Still, this copy is unnecessary at that time. Delaying the re-upload
until the texture is needed again has only advantages and is not
difficult to implement.
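
A minimal sketch of the delayed re-upload, as a toy Python model. The
Driver and Texture classes are hypothetical, and the sketch assumes the
texture image is still available in system memory after it has been
kicked from a heap.

```python
class Texture:
    def __init__(self, name):
        self.name = name
        self.resident = False  # True while a copy sits in a texture heap


class Driver:
    def __init__(self):
        self.uploads = 0

    def kick(self, tex):
        # Eviction just drops the heap copy. Nothing is copied here,
        # because the image is assumed to remain in system memory.
        tex.resident = False

    def bind(self, tex):
        # Upload lazily, only when the texture is actually needed.
        if not tex.resident:
            self.uploads += 1
            tex.resident = True


drv = Driver()
once = Texture("used_once")
again = Texture("used_again")

drv.bind(once)
drv.bind(again)
drv.kick(once)
drv.kick(again)
drv.bind(again)  # only the texture that comes back pays a second upload
```

The texture that is never used again costs nothing at eviction time,
whereas a copy-on-eviction scheme would have moved it for no benefit.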

> 
> > I'd rather reverse your scheme. Upload a texture to the GART heap first,
> > because that's potentially faster (though not with the current
> > implementation in the radeon drivers). When the texture is needed more
> > frequently, try promoting it to the local texture heap.
> 
> I thought about this, but there is no automatic way to figure out when
> to promote from GART to local.

Yes there is. In the current scheme, whenever a texture is bound to a
hardware tex unit the driver calls driUpdateTexLRU, which moves the
texture to the front of the LRU list. In this function you could easily
count how often or how frequently a texture has been used. Based on this
information and maybe the texture size you could decide which textures
to promote and when. You will keep promoting textures until the local
heap is full of non-stale textures.
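
A minimal sketch of such counting, as a toy Python model. This is not
the body of driUpdateTexLRU; update_tex_lru, should_promote, the field
names and both constants are assumptions made for the example.

```python
PROMOTE_AFTER_BINDS = 3              # assumed threshold
MAX_PROMOTE_SIZE = 4 * 1024 * 1024   # assumed size cap in bytes


class Texture:
    def __init__(self, name, size):
        self.name = name
        self.size = size
        self.heap = "gart"   # new textures start in the GART heap
        self.bind_count = 0


lru = []  # index 0 is the most recently used texture


def update_tex_lru(tex):
    # Move the texture to the front of the LRU list and count the use.
    if tex in lru:
        lru.remove(tex)
    lru.insert(0, tex)
    tex.bind_count += 1


def should_promote(tex):
    # Promote GART textures that are used often and are not too large.
    return (tex.heap == "gart"
            and tex.bind_count >= PROMOTE_AFTER_BINDS
            and tex.size <= MAX_PROMOTE_SIZE)


hot = Texture("hot", 256 * 1024)
cold = Texture("cold", 256 * 1024)
huge = Texture("huge", 16 * 1024 * 1024)

update_tex_lru(cold)
for _ in range(3):
    update_tex_lru(hot)
    update_tex_lru(huge)
```

Here only the small, frequently bound texture qualifies. The large one
is used just as often but is held back by the assumed size cap.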

>  Same problem when local overflows, what
> do you demote to AGP? You still have copies with this scheme too.

Textures are sorted in LRU-order on the texture heaps. So you always
kick least recently used textures first. It has always worked like this
even in the current scheme. For promoting textures I would only kick
stale textures from the local heap.
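
A minimal sketch of the "only kick stale textures" rule, as a toy
Python model. The make_room function, the frame-based definition of
"stale" and the STALE_AFTER_FRAMES value are assumptions made for the
example.

```python
STALE_AFTER_FRAMES = 60  # assumed: unused for this many frames means stale


class Texture:
    def __init__(self, name, size, last_used):
        self.name = name
        self.size = size
        self.last_used = last_used  # frame number of the last bind


def make_room(heap, capacity, needed, now):
    """Free `needed` units in `heap` by kicking stale textures only.

    `heap` is ordered most recently used first, so the least recently
    used textures are examined first. The heap is modified only if
    enough space can be freed. Returns True on success.
    """
    free = capacity - sum(t.size for t in heap)
    victims = []
    for tex in reversed(heap):
        if free >= needed:
            break
        if now - tex.last_used < STALE_AFTER_FRAMES:
            break  # everything further up the list is more recent still
        victims.append(tex)
        free += tex.size
    if free < needed:
        return False
    for tex in victims:
        heap.remove(tex)
    return True


hot = Texture("hot", 4, last_used=100)
warm = Texture("warm", 4, last_used=90)
old = Texture("old", 4, last_used=10)
local_heap = [hot, warm, old]

first = make_room(local_heap, capacity=12, needed=4, now=100)
second = make_room(local_heap, capacity=12, needed=8, now=100)
```

The first request succeeds by kicking the stale texture. The second
fails and leaves the heap untouched, because satisfying it would mean
kicking a texture that is still in use.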

> 
> Going first to local and then demoting to AGP sorts everything
> automatically. It may cause a little more churn in the heaps,

In my experience texture uploads are quite expensive. So IMO avoiding
unnecessary texture uploads or copies should have a high priority.

>  but the
> advantage is that the algorithm is very simple and doesn't need much
> tuning. The only tunable parameter is determining when the top of the
> AGP heap is "hot" and booting it. You could use something simple like
> boot after 500 accesses.

I don't think my algorithm is much more complicated. It can be
implemented by gradual improvements of the current algorithm (freeing
stale texture memory is one step), which helps avoid unexpected
performance regressions. At the moment I'm not planning to rewrite it
from scratch, especially because I can't test on any hardware where I
can actually measure great performance improvements ATM.

The only tunable parameter in my algorithm is how often/frequently used
a texture must be in order to try to promote it to the local texture
heap. Maybe there are a few more degrees of freedom, because you can
also consider the texture size for promotion. I think the steady state
result would be about the same as with your algorithm, but I expect my
scheme to work better when textures are used very infrequently or
updated very frequently (movie players). In particular this would make
the texture_heaps option unnecessary, which is a good thing IMO (good
performance without tuning is good for Joe Average User).

Anyway, anyone is free to implement an alternative algorithm for
comparison. If it works better, then it will be adopted. However, I'm
not convinced your algorithm is going to work better than mine (you
asked for my opinion, didn't you), so I'm not going to implement it.

-- 
| Felix Kühling <[EMAIL PROTECTED]>                     http://fxk.de.vu |
| PGP Fingerprint: 6A3C 9566 5B30 DDED 73C3  B152 151C 5CC1 D888 E595 |
