Jeff Hartmann wrote:
>
> Keith Whitwell wrote:
>
> > Benjamin Herrenschmidt wrote:
> >
> >>> HOWEVER, if you tied the GART mapping to the DRM lock, you might be ok.
> >>> That gives you the required system exclusion, and if you make it an
> >>> explicit "get my GART context" function that is only called under
> >>> the DRM lock _and_ only called when you actually need the AGP access,
> >>> you also avoid the unnecessary context switches.
> >>>
> >>> You might still have some performance issues simply because you
> >>> would do extra work when switching aperture mappings, but hopefully
> >>> the GART switch wouldn't be a common operation.
> >>>
> >>> The flexibility you would get _might_ be worth it.
> >>
> >> Well, I would personally vote for the processes _not_ relying on having
> >> the AGP aperture mapped directly, but instead, the various memory pages
> >> making their AGP aperture. Several chipsets (Apple ones for sure, but it
> >> seems others are hitting this too nowadays) don't support AGP aperture
> >> accesses from the CPU.
> >
> > What are you actually saying, that pages mapped in agp can't be
> > written by any means, or just that they can't be written through the
> > agp address range?
> >
> > It sounds kind of broken to me in any case. How do mtrrs work in this
> > world?
>
> Actually we should go to this model eventually. However it needs me to
> have time to finish the Page Attribute Table support I started on at
> VA. This allows write combining to be set on a per page basis, and is
> the direction we want to go even on x86.
>
> >> That way, if you want several AGP contexts, you can have the processes
> >> tapping their AGP buffers without lock, locking would only be required
> >> once it's time to move one of these buffers in/out the physical GART
> >> under the arbitration of the DRM.
> >
> > You don't need to lock to write to agp buffers in the current scheme.
> >
> > You also don't need to play with the gart table just to draw a
> > 2-triangle strip. On some chipsets, particularly under smp,
> > modifying the gart table is very slow. Ask Jeff about this.
> >
> > Keith
>
> This is also true, but I've done a lot of heavy thinking on this very
> issue. The key is to manage the agp aperture and only swap out regions
> when you absolutely have to. The big key to getting something like
> this to work is a memory manager that every client uses, and is based on
> some sort of sarea. It should be designed with a certain minimum block
> size, and have a few different flags for what kind of usage that memory
> block has. (I can go into more detail on design, but you probably have
> a good idea what I mean here.) Then the next step is to create kernel
> calls which can swap things to and from agp space and the card. On
> cards that support it, another path (which prevents GART rewrites
> entirely) is to add support to swap to normal cached memory.
>
> This is what I envision making sense in the long run. A global
> memory manager using an sarea (doesn't have to be the main one) and a
> good aging mechanism get us most of the way there.

Jeff,

It might be helpful to clarify the different uses of AGP we are
discussing. In this thread so far, we've been jumping all over. Here's
a shot at an AGP breakdown. Feel free to correct my misconceptions.

1) The original use of AGP under Linux is MMIO transactions that are
   faster than over PCI. Some level of improvement happens here by
   simply accessing a device on an AGP bus, and no special AGP
   programming is required.

2) Simple MMIO transactions can be optimized by enabling fast writes.
   This case is identical to the MMIO transactions in the first case,
   but the bus and graphics chipset use hardware pipelining to increase
   throughput. Because of the pipelining, there is a penalty for
   turning the bus around (write/read/write/read). There are also
   certain combinations of host chipsets and graphics chips where
   enabling fast writes can cause hangs.

The remaining cases all use AGP bus mastering, where the graphics chip
can read and write AGP memory directly.

3) Static AGP Allocation. This is the primary functionality that the
   agpgart module provides today. Physical memory is allocated by
   agpgart as needed, and that memory is managed on behalf of user space
   and the DRM drivers at run time. The amount of this memory is finite,
   dictated by the size of the AGP aperture (typically 64M). We have
   not fully exploited this case in user space yet. The prototype AGP
   allocator and glDrawPixels transfer mechanism in the Matrox G400
   driver is a good example of the potential here.

4) Dynamic AGP Binding. This functionality is specified in the agpgart
   interface but is not fully implemented yet. The intention is for
   user space processes to be able to bind normal virtual pages to the
   AGP aperture in a very dynamic fashion. Some of the discussions
   about binding and unbinding virtual memory make this option sound
   less appealing. Linus indicated it would probably be more efficient
   to just copy the virtual memory to an uncached AGP page from the
   static allocator case.

5) AGP Swapping w/ Graphics HW access only. The hope here is that the
   graphics hardware could somehow use more memory than fits in the
   aperture at any given time. This would be useful for efficiently
   swapping in and out large chunks of data accessed only by the
   graphics hardware. There is no need for VM access by the host, just
   a need to virtualize many instances of this type of data when
   swapping between graphics contexts (think private back buffers,
   extended texture store, etc). If a subset of the AGP aperture could
   be backed by a much larger number of uncached system memory pages,
   this would be a very useful mechanism indeed. We could even use the
   drmLock to protect access to these pages if that helped to minimize
   the TLB flush issues. I don't know if this can be done efficiently.

6) AGP Swapping w/ Host access to Swapped in pages. Same as case 5,
   but we would also like the host to be able to access the pages when
   they are swapped in. This case would make it possible to put AGP
   textures in the swapped space.

7) AGP Swapping w/ Host access to all pages. Same as case 5, but the
   host would also be able to access all AGP pages regardless of
   whether they are swapped in or not. This is a lot like case 4,
   Dynamic AGP Binding, except the memory would be allocated from
   agpgart first. I don't know if accessing swapped out pages has any
   immediate value.

These are the cases I had in mind. It seems we were crossing between
cases 4, 5 and 6 in our discussion of AGP memory management. I hope
this breakdown can add some clarity.

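To make case 5 a little more concrete, here is a rough user-space
simulation in C of a small aperture window shared by several graphics
contexts. It is only a sketch under stated assumptions: every name in it
(swap_window, gfx_context, window_bind, WINDOW_SLOTS) is hypothetical and
is not part of agpgart or the DRM, the "locked" flag merely stands in for
holding the drmLock, and a GART update is modeled as a counter increment
rather than a real table write or TLB flush.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is agpgart or DRM API. */
#define WINDOW_SLOTS 4          /* assumed: aperture pages set aside for swapping */
#define NO_CONTEXT   (-1)
#define NO_PAGE      (-1)

/* One graphics context's private data (back buffer, textures, ...). */
struct gfx_context {
    int pages[WINDOW_SLOTS];    /* ids of the uncached system pages backing it */
};

struct swap_window {
    int slot[WINDOW_SLOTS];     /* page id currently mapped in each aperture slot */
    int current;                /* context whose pages are mapped, or NO_CONTEXT */
    unsigned long rewrites;     /* simulated GART entry writes so far */
    int locked;                 /* stand-in for holding the drmLock */
};

void window_init(struct swap_window *w)
{
    for (size_t i = 0; i < WINDOW_SLOTS; i++)
        w->slot[i] = NO_PAGE;
    w->current = NO_CONTEXT;
    w->rewrites = 0;
    w->locked = 0;
}

/*
 * Map context `ctx` into the window. The caller must hold the lock.
 * Nothing is rewritten when that context is already mapped, so the
 * cost is only paid on an actual switch between contexts.
 */
void window_bind(struct swap_window *w, const struct gfx_context *ctxs, int ctx)
{
    assert(w->locked);
    if (w->current == ctx)
        return;
    for (size_t i = 0; i < WINDOW_SLOTS; i++) {
        if (w->slot[i] != ctxs[ctx].pages[i]) {
            w->slot[i] = ctxs[ctx].pages[i];
            w->rewrites++;      /* real hardware would also need a TLB flush here */
        }
    }
    w->current = ctx;
}
```

A caller would set "locked", call window_bind() just before it needs the
graphics hardware to see its pages, and clear "locked" afterwards. The
rewrites counter then gives a crude feel for how often the GART would be
touched under a given pattern of context switches, which is the cost
Keith and Jeff were worried about.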
Regards,
Jens

-- 
Jens Owen
Steamboat Springs, Colorado