On Thursday 13 October 2005 07:51, Michel Dänzer wrote:
> There's no question that the override is useful for developers, the
> question is whether it isn't more harm- than useful for users.

I've often thought it'd be nice to have the VideoRAM option in the config file be clamped to the max(user specified, driver probed), with some magic value the driver could specify to say it has no real idea how much VRAM is available.

> > And, the driver also limits texture memory to only be useable up to
> > 128MB, and I think this is not necessary (as textures are always blitted
> > using the gpu and the memory used by them never touched directly by the
> > cpu) or is it?
>
> Indeed, that memory would probably be useful for textures for now, but
> maybe CPU access to textures in the framebuffer will be necessary in the
> future?

I don't think so.

For fixed function cards, the numbers I've been getting while playing with accelerating XGetImage and XPutImage in EXA suggest that even for fairly small updates to offscreen images (about an 8x8 tile update or so), it's faster to download the subimage you're interested in, modify it in host RAM, and re-upload it than it is to do CPU-driven access directly. XGetImage of XYPixmaps is a good example: DMAing the pixmap down from the framebuffer and then converting ZPixmap to XYPixmap in host memory is between 3 and 12 times faster than the normal software path.

For cards with useful fragment shaders, it'd be really really hot to see the server's fb layer implemented in fragment shaders and do even core X rendering entirely on-card. This is basically the Quartz 2D Extreme model. Again, you need to get this data off the card sometimes for things like glReadPixels or XGetImage, but that should really be done with DMA, or a proper memcpy at minimum.

Think of it as manual cache management. Block transfers are fairly quick, and modifying data within a memory domain is really fast, but single-word updates between domains are just painful.
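
To make the download, modify, re-upload pattern concrete, here is a minimal sketch in C. It is not EXA code: the Surface struct, the download_tile, upload_tile and invert_tile helpers, and the 64x64 tile limit are all hypothetical names and values chosen for illustration, and a plain array with memcpy stands in for card memory and the DMA a real driver would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TILE_PIXELS (64 * 64)   /* assumed upper bound for the host staging buffer */

/* Hypothetical stand-in for an offscreen pixmap living in card memory. */
typedef struct {
    uint32_t *base;   /* first pixel of the pixmap */
    int pitch;        /* row stride, in pixels */
} Surface;

/* "Download": block-copy a w x h tile into a tightly packed host buffer. */
static void download_tile(const Surface *src, int x, int y, int w, int h,
                          uint32_t *host)
{
    assert(w > 0 && h > 0);
    for (int row = 0; row < h; row++)
        memcpy(host + (size_t)row * w,
               src->base + (size_t)(y + row) * src->pitch + x,
               (size_t)w * sizeof(uint32_t));
}

/* "Upload": block-copy the packed host buffer back into the surface. */
static void upload_tile(Surface *dst, int x, int y, int w, int h,
                        const uint32_t *host)
{
    assert(w > 0 && h > 0);
    for (int row = 0; row < h; row++)
        memcpy(dst->base + (size_t)(y + row) * dst->pitch + x,
               host + (size_t)row * w,
               (size_t)w * sizeof(uint32_t));
}

/*
 * Invert the low 24 bits of every pixel in a tile. All per-pixel work
 * happens in host RAM; the surface is only touched by the two block
 * transfers. Returns 0 on success, -1 if the tile does not fit the
 * staging buffer.
 */
static int invert_tile(Surface *s, int x, int y, int w, int h)
{
    uint32_t host[MAX_TILE_PIXELS];

    if (w <= 0 || h <= 0 || (long)w * h > MAX_TILE_PIXELS)
        return -1;

    download_tile(s, x, y, w, h, host);
    for (int i = 0; i < w * h; i++)
        host[i] ^= 0x00FFFFFFu;
    upload_tile(s, x, y, w, h, host);
    return 0;
}
```

The point of the shape is that the inner loop never dereferences the surface: the cost of crossing between memory domains is paid twice per tile rather than once per pixel.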

So I guess to answer your question, memory outside the BAR is fine to use only for textures, because if the host really wants to modify them it should do so only between DFS (DownloadFromScreen) and UTS (UploadToScreen) pairs, and presumably the GPU can use its entire address space for DMA sources and targets rather than only the range visible through the PCI bus aperture.

- ajax