Re: [Dri-devel] R200 kernel interfaces

Jens Owen Tue, 18 Jun 2002 20:16:55 -0700

Jeff Hartmann wrote:
> 
> Keith Whitwell wrote:
> 
> > Benjamin Herrenschmidt wrote:
> >
> >>> HOWEVER, if you tied the GART mapping to the DRM lock, you might be ok.
> >>> That gives you the required system exclusion, and if you make it an
> >>> explicit "get my GART context" function that is only called under
> >>> the DRM lock _and_ only called when you actually need the AGP access,
> >>> you also avoid the unnecessary context switches.
> >>>
> >>> You might still have some performance issues simply because you
> >>> would do extra work when switching aperture mappings, but hopefully
> >>> the GART switch wouldn't be a common operation.
> >>>
> >>> The flexibility you would get _might_ be worth it.
> >>>
> >>
> >> Well, I would personally vote for the processes _not_ relying on having
> >> the AGP aperture mapped directly, but instead, the various memory pages
> >> making their AGP aperture. Several chipsets (Apple ones for sure, but it
> >> seems others are hitting this too nowadays) don't support AGP aperture
> >> accesses from the CPU.
> >
> >
> > What are you actually saying, that pages mapped in agp can't be
> > written by any means, or just that they can't be written through the
> > agp address range?
> >
> > It sounds kind of broken to me in any case.  How do mtrrs work in this
> > world?
> 
> Actually we should go to this model eventually.  However it needs me to
> have time to finish the Page Attribute Table support I started on at
> VA.  This allows write combining to be set on a per page basis, and is
> the direction we want to go even on x86.
> 
> >
> >
> >> That way, if you want several AGP contexts, you can have the processes
> >> tapping their AGP buffers without lock, locking would only be required
> >> once it's time to move one of these buffers in/out the physical GART
> >> under the arbitration of the DRM.
> >
> >
> > You don't need to lock to write to agp buffers in the current scheme.
> >
> > You also don't need to play with the gart table just to draw a
> > 2-triangle  strip.  On some chipsets, particularly under smp,
> > modifying the gart table is  very slow.  Ask Jeff about this.
> >
> > Keith
> >
>    This is also true, but I've done a lot of heavy thinking on this very
> issue.  The key is to manage the agp aperture and only swap out regions
> when you absolutely have to.  The big key to getting something like
> this to work is a memory manager that every client uses, and is based on
> some sort of sarea.  It should be designed with a certain minimum block
> size, and have a few different flags for what kind of usage that memory
> block has.  (I can go into more detail on design, but you probably have
> a good idea what I mean here.)  Then the next step is to create kernel
> calls which can swap things to and from agp space and the card.  On
> cards that support it, another path (which prevents GART rewrites
> entirely) is to add support to swap to normal cached memory.
>    This is what I envision making sense in the long run.  A global
> memory manager using an sarea (doesn't have to be the main one) and a
> good aging mechanism get us most of the way there.
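
As a rough illustration of the manager Jeff describes above (a shared
block map with a minimum block size, usage flags and an aging
mechanism), here is a minimal sketch in C.  Every name, size and flag
in it is hypothetical and chosen only for illustration; nothing here is
taken from agpgart or the DRM, and the least-recently-used victim choice
is just one possible aging policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names and sizes are hypothetical, chosen for illustration only. */
#define BLOCK_SIZE  (64u * 1024u)   /* assumed minimum block size */
#define NUM_BLOCKS  1024u           /* 64 MB aperture / 64 KB blocks */

enum block_usage {                  /* assumed usage flags */
    BLOCK_FREE    = 0,
    BLOCK_TEXTURE = 1,
    BLOCK_VERTEX  = 2,
    BLOCK_PINNED  = 4               /* never a swap-out candidate */
};

struct block_info {
    uint32_t usage;                 /* bitmask of enum block_usage */
    uint32_t age;                   /* clock value at last use */
};

/* In a real driver this table would live in a shared area (sarea). */
struct aperture_map {
    uint32_t clock;                 /* counter wrap is ignored in this sketch */
    struct block_info blocks[NUM_BLOCKS];
};

/* Record that block i was just used. */
void map_touch(struct aperture_map *m, size_t i)
{
    m->blocks[i].age = ++m->clock;
}

/* Claim the first free block; returns its index, or -1 if none is free. */
long map_alloc(struct aperture_map *m, uint32_t usage)
{
    assert(usage != BLOCK_FREE);
    for (size_t i = 0; i < NUM_BLOCKS; i++) {
        if (m->blocks[i].usage == BLOCK_FREE) {
            m->blocks[i].usage = usage;
            map_touch(m, i);
            return (long)i;
        }
    }
    return -1;
}

/* Pick the least recently used, unpinned, in-use block as the swap-out
 * victim; returns -1 if every in-use block is pinned or the map is empty. */
long map_pick_victim(const struct aperture_map *m)
{
    long victim = -1;
    for (size_t i = 0; i < NUM_BLOCKS; i++) {
        const struct block_info *b = &m->blocks[i];
        if (b->usage == BLOCK_FREE || (b->usage & BLOCK_PINNED))
            continue;
        if (victim < 0 || b->age < m->blocks[victim].age)
            victim = (long)i;
    }
    return victim;
}
```

A zero-initialized struct aperture_map starts with every block free.
Clients would call map_alloc() to claim blocks and map_touch() on each
use, and the swap path would call map_pick_victim() only when
map_alloc() returns -1, which matches the "only swap out regions when
you absolutely have to" goal.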


Jeff,

It might be helpful to clarify the different uses we are discussing WRT
AGP.  In this thread so far, we've been jumping all over.  Here's a
shot at an AGP breakdown.  Feel free to correct my misconceptions.

1) The original utilization of AGP under Linux is faster MMIO
transactions than PCI.  Some level of improvement happens here by simply
accessing a device on an AGP bus, and no special AGP programming is
required.

2) Simple MMIO transactions can be optimized by enabling fast writes.
This case is identical to the MMIO transactions in the first case, but
the bus and graphics chipset utilize hardware pipelining to increase
throughput.  Because of the pipelining, there is a penalty for turning
the bus around (write/read/write/read).  There are also certain
combinations of host chipsets and graphics chips where enabling fast
writes can cause hangs.

The remaining cases all utilize AGP bus mastering, where the graphics
chip can read from and write to AGP memory directly.
 
3) Static AGP Allocation.  This is the primary functionality that the
agpgart module provides today.  Physical memory is allocated by agpgart
as needed and that memory is managed on behalf of the user space and DRM
drivers at run time.  There is a finite amount of this memory available,
dictated by the size of the AGP aperture (typically 64M).  We have not
fully exploited this case in user space, yet.  The prototype for the AGP
allocator and transfer mechanism of glDrawPixels in the Matrox G400
driver is a good example of the potential here.

4) Dynamic AGP Binding.  This functionality is spec'ed in the agpgart
interface but is not fully implemented, yet.  The intention is for user
space processes to be able to bind normal virtual pages to the AGP
aperture in a very dynamic fashion.  Some of the discussions about
binding and unbinding virtual memory make this option sound less
appealing.  Linus indicated it would probably be more efficient to just
copy the virtual memory to an uncached AGP page from the static
allocator case.

5) AGP Swapping w/ Graphics HW access only.  The hope here is that the
graphics hardware could somehow utilize more memory than could fit in
the aperture at any given time.  This would be useful for efficiently
swapping in and out large chunks of data only accessed by the graphics
hardware.  No need for VM access by the host, just a need to virtualize
many instances of this type of data when swapping between graphics
contexts (think private back buffers, extended texture store, etc).  If
a subset of the AGP aperture could be backed by a much larger number
of uncached system memory pages, this would be a very useful mechanism
indeed.  We could even utilize the drmLock to protect access to these
pages if that helped to minimize the TLB flush issues.  I don't know if
this can be done efficiently.

6) AGP Swapping w/ Host access to Swapped in pages.  Same as case 5, but
we would also like the host to be able to access the pages when they are
swapped in.  This case would make it possible to put AGP textures in the
swapped space.

7) AGP Swapping w/ Host access to all pages.  Same as case 5, but the
host would also be able to access all AGP pages regardless of whether
they are swapped in or not.  This is a lot like case 4, Dynamic AGP
Binding, except the memory would be allocated from agpgart first.  I
don't know if accessing swapped out pages has any immediate value.
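
To make case 5 concrete, here is a minimal sketch in C of a small
window of aperture slots backed by a larger pool of system pages.  It
only models the bookkeeping: the names, the sizes and the
least-recently-used replacement policy are all assumptions for
illustration, and the comment inside window_bind() marks where a real
driver would have to rewrite the GART entry and flush, which is the
expensive part.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes: a 4-slot swappable window over 16 backing pages. */
#define APERTURE_SLOTS 4
#define BACKING_PAGES  16
#define NO_PAGE        (-1)

struct swap_window {
    int           slot_page[APERTURE_SLOTS]; /* backing page per slot, or NO_PAGE */
    unsigned long slot_used[APERTURE_SLOTS]; /* tick of last use, 0 = never */
    unsigned long tick;
    unsigned long gart_rewrites;             /* count of (costly) rebinds */
};

void window_init(struct swap_window *w)
{
    for (size_t s = 0; s < APERTURE_SLOTS; s++) {
        w->slot_page[s] = NO_PAGE;
        w->slot_used[s] = 0;
    }
    w->tick = 0;
    w->gart_rewrites = 0;
}

/* Make `page` visible to the graphics hardware and return its slot.
 * Assumed to be called with the hardware lock held. */
int window_bind(struct swap_window *w, int page)
{
    size_t victim = 0;

    assert(page >= 0 && page < BACKING_PAGES);
    w->tick++;
    for (size_t s = 0; s < APERTURE_SLOTS; s++) {
        if (w->slot_page[s] == page) {       /* already resident: no rewrite */
            w->slot_used[s] = w->tick;
            return (int)s;
        }
        if (w->slot_used[s] < w->slot_used[victim])
            victim = s;                      /* oldest (or never used) slot */
    }
    /* A real driver would rewrite the GART entry and flush here. */
    w->slot_page[victim] = page;
    w->slot_used[victim] = w->tick;
    w->gart_rewrites++;
    return (int)victim;
}
```

Binding a page that is already resident costs nothing in this model, so
gart_rewrites gives a rough measure of how often a given access pattern
would hit the slow path.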

These are the cases I had in mind.  It seems we were crossing between
cases 4, 5 and 6 in our discussion of AGP memory management.  I hope this
breakdown can add some clarity.

Regards,
Jens

--                             /\
         Jens Owen            /  \/\ _    
  [EMAIL PROTECTED]  /    \ \ \   Steamboat Springs, Colorado

