Re: [Dri-devel] R200 kernel interfaces

Benjamin Herrenschmidt Tue, 18 Jun 2002 13:44:06 -0700

>
>
>On Mon, 17 Jun 2002, Benjamin Herrenschmidt wrote:
>>
>> >    mmap() and AGP driver gives access to IOMEM/AGP
>>
>> That one is problematic. I don't support the mmap interface properly
>> on Apple chipsets for example, because they don't support the AGP
>> aperture being accessed by the CPU.
>
>I assume you mean that the CPU doesn't honour the AGP mappings, but the
>CPU _can_ access the physical pages themselves.


Of course, it would be pretty useless if the aperture couldn't be used
at all ;) Sorry for the confusion.

>How do you do it right
>now, since we seem to be doing "ioremap_nocache()" all over the place with
>the AGP aperture?

Not that much. I really only bothered with the r128 and radeon drivers.
The kernel side uses a few well-localized ioremap calls, which I turned
into agp_ioremap calls. I then implement that function by directly
building a virtual mapping from the underlying AGP pages.
The userland side uses drmMap, which already sets up some vmops
for various cases. I just had to add a specific case for this kind of
AGP host, one that hands out the real RAM pages on the fly.

>But fundamentally that should not be a problem: we can map the (unmapped)
>AGP pages one page at a time (rather than as one contiguous block of
>remapped pages) into user mode.

Using vmops is easy, provided that nobody plays tricks like binding/unbinding
memory from the GART behind our back. That is my main problem with the
current ioctl interface to agpgart. Basically, the API allows this, and
the agptest program itself will, for example, map the aperture before
binding memory to it. It's still workable provided the aperture isn't
accessed before memory is bound, but if we allow dynamic binding/unbinding
of memory while the entire aperture is mmap'ed in some process space,
then we potentially have to tear down the mappings of those other processes
on unbind, which is beyond my knowledge of the Linux VM (especially on SMP).

>I thought AGP already supported a mmap() interface, and if it really
>doesn't, it should be trivial to do...
>
> [ Time passes, Linus looks at the sources ]
>
>Ok, there does seem to be mmap() support in the AGP module, but it seems
>to use that stupid "remap_page_range()" and the AGP base (similar to
>ioremap() inside the kernel), so it does seem to mmap the _mapped_ AGP
>area.
>
>It would be possible to just install a "nopage" handler, and map one page
>at a time on demand from the pool of (non-GART-mapped) pages that we keep
>in the gatt_table[] or whatever.

Yup, exactly like what I do with drmMap, but I still don't like the API
for the reason I just explained.
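
To make the "one page at a time" idea concrete, here is a rough userspace
model of what such a fault handler has to compute. It is only a sketch:
model_gatt, model_nopage and the MODEL_* constants are made-up names for
illustration, not real kernel, DRM or agpgart symbols, and a real handler
would also have to deal with locking and page reference counts.

```c
/* Userspace model only: no name here is a real kernel or agpgart symbol. */
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT     12   /* assumes 4 KiB pages */
#define MODEL_APERTURE_PAGES 8    /* tiny aperture, for illustration */

/* Hypothetical stand-in for the table of RAM pages bound into the GART.
 * A NULL slot means nothing is bound at that aperture page. */
struct model_gatt {
    void *page[MODEL_APERTURE_PAGES];
};

/* What a nopage-style handler would compute on a fault: translate the
 * faulting offset inside the mapping into the real RAM page behind it.
 * Returns NULL when the offset is out of range or the slot is unbound;
 * a real handler would turn that into a fault error for the process. */
static void *model_nopage(const struct model_gatt *gatt, size_t fault_offset)
{
    size_t index = fault_offset >> MODEL_PAGE_SHIFT;

    assert(gatt != NULL);
    if (index >= MODEL_APERTURE_PAGES)
        return NULL;
    return gatt->page[index];
}
```

The point of the model is that the CPU never touches the aperture range
itself, it only ever gets handed the underlying RAM page. It also shows
where the trouble starts: once a page has been handed out, clearing its
slot in the table does nothing about mappings that already exist.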

>Maybe there is some reason for doing it that way that I don't understand.
>More likely, it's just done that way because it was the simple and stupid
>approach.
>
>However, you seem to prefer a different approach, which would certainly
>work:
>
>> I would much prefer the agpgart interface to be redesigned around
>> different semantics, mostly vmalloc() some space to use as AGP memory,
>> then bind that to the GART, but don't rely on direct AGP aperture
>> access.
>>
>> There are also some slight speed improvements to win using this
>> scheme as I could map the AGP memory as cacheable (which would give a
>> significant boost on PPC) provided buffers & ring get properly flushed
>> before being "passed" to the chip.
>
>Hmm.. It would be fairly simple to do all page allocation in user space,
>and have an interface that says "put the physical page corresponding to my
>virtual address xxxx into the AGP aperture at offset yyyy".
>
>This would effectively disallow the above "map by unmapped page" approach,
>because it's too damn expensive to find and flush any existing mappings
>when somebody maps in a new page. And if not all systems support the
>GART-assisted CPU mapping that we do now, that means that nobody can mmap
>the AGP area into memory.

Exactly.

>The expensive part would be the "mark this page uncacheable" when moving
>it to the AGP buffer, which implies a cross-CPU TLB flush for each such
>page. So moving a page into the AGP aperture is fundamentally a fairly
>expensive operation: wbinvd itself takes a _loong_ time, but if you have
>to do it on all CPU's along with the TLB flush, it gets _really_
>expensive.
>
>So moving pages that way is definitely not cheap either. Hmm.

What about simpler semantics? A given client needs well-known chunks of
AGP memory (the ring buffer, the indirect buffers, etc...). All we really
need is _one_ call to allocate a chunk of memory and bind it to the GART.
That call would return an opaque ID that processes can later use to mmap
the chunk into their address space, along with the offset into the AGP
aperture where it was bound.
Of course, we need to provide the opposite call for disposing of it, but in
this case, we probably don't need to be smart about processes that still
have it mapped, as that is typically a fatal condition or a programming error.

This avoids the problem with the current agpgart, which is that it allows
more or less random binding/unbinding of memory while processes have the
whole aperture mapped. You just allocate & map the chunks you actually need.
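
A rough userspace model of the two calls I have in mind, just to pin down
the semantics. Everything here is hypothetical: model_alloc_bind,
model_free, the structures and the first-fit placement are made up for
illustration and are not an existing agpgart or DRM interface. The model
only does the bookkeeping, it doesn't allocate real pages or touch a GART.

```c
/* Userspace model only: no name here is a real kernel or agpgart symbol. */
#include <assert.h>
#include <stddef.h>

#define MODEL_APERTURE_PAGES 16
#define MODEL_MAX_CHUNKS     4

struct model_chunk {
    int    in_use;
    size_t first_page;   /* aperture page where the chunk is bound */
    size_t page_count;
};

struct model_aperture {
    unsigned char      bound[MODEL_APERTURE_PAGES]; /* 1 = page is bound */
    struct model_chunk chunk[MODEL_MAX_CHUNKS];
};

/* The _one_ call: allocate page_count pages and bind them at the first
 * free contiguous run of the aperture. Returns an opaque ID (>= 0) and
 * stores the aperture page offset in *first_page_out, or returns -1. */
static int model_alloc_bind(struct model_aperture *ap, size_t page_count,
                            size_t *first_page_out)
{
    int id;
    size_t start, n;

    assert(ap != NULL && first_page_out != NULL);
    if (page_count == 0 || page_count > MODEL_APERTURE_PAGES)
        return -1;

    /* Find a free ID slot first. */
    for (id = 0; id < MODEL_MAX_CHUNKS; id++)
        if (!ap->chunk[id].in_use)
            break;
    if (id == MODEL_MAX_CHUNKS)
        return -1;

    /* First-fit search for page_count unbound pages in a row. */
    for (start = 0; start + page_count <= MODEL_APERTURE_PAGES; start++) {
        for (n = 0; n < page_count && !ap->bound[start + n]; n++)
            ;
        if (n < page_count)
            continue;
        for (n = 0; n < page_count; n++)
            ap->bound[start + n] = 1;
        ap->chunk[id].in_use = 1;
        ap->chunk[id].first_page = start;
        ap->chunk[id].page_count = page_count;
        *first_page_out = start;
        return id;
    }
    return -1; /* no free run large enough */
}

/* The opposite call: unbind and release the whole chunk. It makes no
 * attempt to track processes that still have the chunk mapped. */
static int model_free(struct model_aperture *ap, int id)
{
    size_t n;

    assert(ap != NULL);
    if (id < 0 || id >= MODEL_MAX_CHUNKS || !ap->chunk[id].in_use)
        return -1;
    for (n = 0; n < ap->chunk[id].page_count; n++)
        ap->bound[ap->chunk[id].first_page + n] = 0;
    ap->chunk[id].in_use = 0;
    return 0;
}
```

A client would call model_alloc_bind() once for the ring and once for the
indirect buffers, then mmap each chunk by its ID. Since a chunk is only
ever bound and unbound as a whole, there is no way to rebind individual
aperture pages underneath an existing mapping.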


Ben.



