Re: Aperture mapping under GEM

2008-08-03 Thread Dave Airlie
 
 Managing a fake linear address space just to match some existing
 arbitrary API requirements is insane. Creating the right interface for
 my UMA environment is my goal. I'm not sure precisely what that API
 should be, but at least this one is obviously wrong.

Isn't that also what you are trying to do with GEM though: match GPU 
objects to the file interface? Now the thing is if you don't consider GTT 
mapping to be the same as normal mapping, you need an Intel specific GTT 
map call, however that means a do_mmap you don't intend on ever changing 
to a real mmap call. Now you need to justify that to the vfs people.

I do wonder if you are better having an alternate open method that flags 
the mmap different, but that doesn't make much sense to me either. 
However creating new MAP_GTT means breaking the generic interface.

Dave.

 I want to handle thousands of discrete objects and be able to map them
 independently into my process, and bind them independently to the GTT.
 Only a few will ever be mapped to my process and while all of them will
 be bound to the GTT at times, only a subset will fit at any particular
 time.
 
 

-
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK  win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100url=/
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: Aperture mapping under GEM

2008-08-03 Thread Thomas Hellström
Keith Packard wrote:

  and that's why TTM 
 needs to manage a fake linear address space for the drm fd.
 

 Managing a fake linear address space just to match some existing
 arbitrary API requirements is insane. Creating the right interface for
 my UMA environment is my goal. I'm not sure precisely what that API
 should be, but at least this one is obviously wrong.
   
I'm not sure I agree. 

What we're discussing is really per buffer object address space or per 
device address space.

With the current GEM implementation, the address space is per buffer 
object, and if this were done correctly you'd duplicate the shmemfs 
filesystem to make a drmfs 
filesystem where you have complete control over creation and mmap-ing 
and do not need to create special cases to work around the shmemfs 
implementation. It's not impossible that you can overload the shmemfs 
mmap / fault methods of the shmemfs filesystem, but what you're 
suggesting isn't really what I'd refer to as the cleanest and most 
natural interface. Since you were asking for comments, I'd strongly 
recommend avoiding trying to manipulate ptes from the driver.

The other approach is to use one address space per device. An address 
space is obviously needed to be able to do unmap_mapping_range, read, 
write, seek etc. It's not an arbitrary API requirement. It's the Linux 
file operations API requirement.  Since the address space is per device 
it needs to be managed. I see nothing wrong with that, except you don't 
get a filesystem entry per buffer, and you need to be aware what the 
limitations are: that the address space may become fragmented and 
resizing becomes complicated.
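
As a rough illustration of the fragmentation concern, here is a minimal user-space sketch of a first-fit allocator over a tiny per-device offset space. It is not TTM code; `struct dev_space`, `dev_space_alloc` and `dev_space_free` are hypothetical names, and the 12-page space is deliberately small.

```c
#include <stdbool.h>
#include <stddef.h>

#define SPACE_PAGES 12          /* tiny on purpose, to make fragmentation visible */

struct dev_space {
    bool used[SPACE_PAGES];     /* one flag per page of the device offset space */
};

/* First-fit: returns the starting page offset, or -1 if no hole is big enough. */
static inline long dev_space_alloc(struct dev_space *s, size_t npages)
{
    size_t run = 0;

    if (npages == 0 || npages > SPACE_PAGES)
        return -1;
    for (size_t i = 0; i < SPACE_PAGES; i++) {
        run = s->used[i] ? 0 : run + 1;
        if (run == npages) {
            size_t start = i + 1 - npages;
            for (size_t j = start; j <= i; j++)
                s->used[j] = true;
            return (long)start;
        }
    }
    return -1;
}

/* Release a previously allocated range of pages. */
static inline void dev_space_free(struct dev_space *s, size_t start, size_t npages)
{
    for (size_t j = start; j < start + npages && j < SPACE_PAGES; j++)
        s->used[j] = false;
}
```

Allocating three 4-page buffers and then freeing the first and the last leaves 8 pages free, yet an 8-page request fails because the free pages are not contiguous.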

Given this, it's possible to make a choice what fits the driver best.
A lightweight driver that needs to manipulate ptes to account for 
caching and placement would probably use the latter method, which is 
what TTM currently does.

You've chosen the first and are faced with one of:

1) Hack ptes from the driver.
2) Try to overload the shmemfs mmap / fault methods.
3) Implement a new drmfs filesystem.

/Thomas










Re: Aperture mapping under GEM

2008-08-03 Thread Keith Packard
On Sun, 2008-08-03 at 10:53 +0200, Thomas Hellström wrote:

 With the current GEM implementation, the address space is per buffer 
 object, and if this were done
 correctly you'd duplicate the shmemfs filesystem to make a drmfs 
 filesystem where you have complete control over creation and mmap-ing 
 and do not need to create special cases to work around the shmemfs 
 implementation.

I am not working around the shmem implementation at all; I'm using
regular shmem objects just as they are. The only thing I'm working
around at this point is the artificial kernel limit of 1024 fds. With
more fds, I could simply allocate shmem objects and pass them into my
environment.

Once the objects are allocated, I use regular kernel APIs to map those
pages to my device.

  I'd strongly 
 recommend avoiding trying to manipulate ptes from the driver.

I don't touch the shmem PTEs at all. The mapping I'm adding is entirely
separate from shmem and involves mapping portions of the GTT aperture
which just happen to contain pointers to shmem-allocated pages.

 The other approach is to use one address space per device.

This would require constructing an entirely artificial linear space for
my objects. You then have to track this per-device linear address for
each object and pass that into the mmap call. And, what does it mean
when you ask to mmap a range spanning multiple objects?

 1) Hack ptes from the driver.

Nope, not doing this; the GTT-based mapping would allocate separate PTEs
using the existing standard device mapping APIs.

 2) Try to overload the shmemfs mmap / fault methods.

I don't need to do this either; shmem handles its pages just fine.

 3) Implement a new drmfs filesystem.

I would prefer to use the existing shmem mechanisms instead of precisely
duplicating them.

-- 
[EMAIL PROTECTED]


signature.asc
Description: This is a digitally signed message part


Re: Aperture mapping under GEM

2008-08-03 Thread Keith Packard
On Sun, 2008-08-03 at 08:07 +0100, Dave Airlie wrote:

 Isn't that also what you are trying to do with GEM though.. match GPU 
 objects to the file interface.

Yes, with a 1-1 mapping between GPU objects and file objects. You can
use the normal read/write/mmap API on them. The reason we aren't using
fds now is just that the kernel cannot handle this many fds per process.

  Now the thing is if you don't consider GTT 
 mapping to be the same as normal mapping, you need an Intel specific GTT 
 map call,

I want to map these pages in two different ways, the first way is
through normal WB mapping which provides the expected memory semantics
(cached reads and writes). The second is to map them through the GTT
which offers two important benefits:

 1) WC mapping which avoids the need to clflush when passing data from
application to GPU.

 2) Linearized access to tiled surfaces. This uses the tile swizzling
HW in the GPU to construct a synthetic linear view of the tiled
surface which is currently required when doing SW rendering from
inside the X server.
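
To make the "synthetic linear view" concrete, here is a minimal sketch that compares a linear byte offset with the offset of the same byte in a tiled surface. It assumes an X-tiled layout of 512-byte by 8-row tiles (4096 bytes each) and ignores any address-bit swizzling; the function names are hypothetical, and the real translation is done by the GPU hardware, not by code like this.

```c
#include <stdint.h>

/* Assumed X-tile geometry: 512 bytes wide, 8 rows tall, 4096 bytes per tile. */
#define TILE_WIDTH_BYTES  512u
#define TILE_HEIGHT_ROWS  8u
#define TILE_SIZE_BYTES   (TILE_WIDTH_BYTES * TILE_HEIGHT_ROWS)

/* Linear view: what a software renderer expects. */
static inline uint32_t linear_offset(uint32_t x_bytes, uint32_t y, uint32_t pitch)
{
    return y * pitch + x_bytes;
}

/* Tiled layout: where that byte actually lives in memory.
 * `pitch` must be a multiple of TILE_WIDTH_BYTES. */
static inline uint32_t tiled_offset(uint32_t x_bytes, uint32_t y, uint32_t pitch)
{
    uint32_t tiles_per_row = pitch / TILE_WIDTH_BYTES;
    uint32_t tile_x = x_bytes / TILE_WIDTH_BYTES;
    uint32_t tile_y = y / TILE_HEIGHT_ROWS;
    uint32_t tile_index = tile_y * tiles_per_row + tile_x;

    return tile_index * TILE_SIZE_BYTES
         + (y % TILE_HEIGHT_ROWS) * TILE_WIDTH_BYTES
         + (x_bytes % TILE_WIDTH_BYTES);
}
```

With a 1024-byte pitch, the byte at x = 512, y = 0 sits at linear offset 512 but at tiled offset 4096, which is why software rendering wants the linearized mapping.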

  however that means a do_mmap you don't intend on ever changing 
 to a real mmap call. Now you need to justify that to the vfs people.

Nope, I can use a 'normal' mmap call and have two different address
ranges within my object, one which maps the pages directly and one which
maps them through the GTT. No flags needed here.
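
A minimal sketch of what selecting a window by address range could look like on the client side, assuming the bit-31 split proposed at the start of this thread. `GEM_APERTURE_BIT` and the helper names are hypothetical and not part of any real GEM interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: bit 31 of the mmap offset selects the aperture window. */
#define GEM_APERTURE_BIT  UINT64_C(0x80000000)
#define GEM_WINDOW_SIZE   UINT64_C(0x80000000)  /* 2GB per window */

/* Returns the offset to pass to mmap for byte `off` of the object,
 * or UINT64_MAX if `off` does not fit inside one window. */
static inline uint64_t gem_window_offset(uint64_t off, bool through_aperture)
{
    if (off >= GEM_WINDOW_SIZE)
        return UINT64_MAX;
    return through_aperture ? (off | GEM_APERTURE_BIT) : off;
}

/* True if the mmap offset falls in the aperture window. */
static inline bool gem_offset_is_aperture(uint64_t mmap_off)
{
    return (mmap_off & GEM_APERTURE_BIT) != 0;
}

/* Strips the window selector, leaving the byte offset inside the object. */
static inline uint64_t gem_offset_in_object(uint64_t mmap_off)
{
    return mmap_off & (GEM_WINDOW_SIZE - 1);
}
```

The same object offset 0x1000 then becomes mmap offset 0x1000 for the direct window and 0x80001000 for the aperture window, which also shows why each window is limited to 2GB.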

 I do wonder if you are better having an alternate open method that flags 
 the mmap different, but that doesn't make much sense to me either. 
 However creating new MAP_GTT means breaking the generic interface.

I want to allow the mapping type to be selected on a per-use basis, not
be an attribute of the file handle. I don't generally know up-front what
kind of mapping will be needed. I could have a magic 'dup' ioctl that
gave me a new FD that would do the new mapping type, and use that.

-- 
[EMAIL PROTECTED]




Re: Aperture mapping under GEM

2008-08-03 Thread Thomas Hellström
Keith Packard wrote:

   
 The other approach is to use one address space per device.
 

 This would require constructing an entirely artificial linear space for
 my objects. You then have to track this per-device linear address for
 each object and pass that into the mmap call. And, what does it mean
 when you ask to mmap a range spanning multiple objects?
   
That's clearly an illegal operation and would return an error.
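
A minimal sketch of such a check, assuming each buffer object owns one contiguous range of the per-device offset space. `struct bo_range` and `dev_space_lookup` are hypothetical names, not TTM functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct bo_range {
    uint64_t start;   /* first byte in the per-device offset space */
    uint64_t size;    /* length in bytes */
};

/* Returns the index of the object that fully contains [off, off + len),
 * or -EINVAL if the range is empty, unmapped, or crosses an object boundary. */
static inline int dev_space_lookup(const struct bo_range *bos, size_t n,
                                   uint64_t off, uint64_t len)
{
    if (len == 0)
        return -EINVAL;
    for (size_t i = 0; i < n; i++) {
        if (off >= bos[i].start && off - bos[i].start < bos[i].size) {
            /* Start lies inside object i; the whole range must fit in it too. */
            if (len > bos[i].size - (off - bos[i].start))
                return -EINVAL;
            return (int)i;
        }
    }
    return -EINVAL;
}
```

A request that starts in one object and runs into its neighbour is rejected even though every byte of it is backed by some object.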
   
 1) Hack ptes from the driver.
 

 Nope, not doing this; the GTT-based mapping would allocate separate PTEs
 using the existing standard device mapping APIs.

   
But what happens when you unbind an object from the GTT while you map 
that data through the GTT? In your original email you stated that you'd 
walk through the VMAs and modify the PTEs.

If you want to avoid that, you need to run unmap_mapping_range() on an 
address space. What address space would that be?

/Thomas






Re: Aperture mapping under GEM

2008-08-03 Thread Dave Airlie

 
 Yes, with a 1-1 mapping between GPU objects and file objects. You can
 use the normal read/write/mmap API on them. The reason we aren't using
 fds now is just that the kernel cannot handle this many fds per process.

Well it can now, we just need to fix the X server :)

 
 I want to map these pages in two different ways, the first way is
 through normal WB mapping which provides the expected memory semantics
 (cached reads and writes). The second is to map them through the GTT
 which offers two important benefits:
 
  1) WC mapping which avoids the need to clflush when passing data from
 application to GPU.
 
  2) Linearized access to tiled surfaces. This uses the tile swizzling
 HW in the GPU to construct a synthetic linear view of the tiled
 surface which is currently required when doing SW rendering from
 inside the X server.
 
   however that means a do_mmap you don't intend on ever changing 
  to a real mmap call. Now you need to justify that to the vfs people.
 
 Nope, I can use a 'normal' mmap call and have two different address
 ranges within my object, one which maps the pages directly and one which
 maps them through the GTT. No flags needed here.

Well bit-31 is now a flag, just under an assumed name with a fake 
passport. The question is whether this matters at all, or whether the Intel 
driver can just do it that way and have Intel-specific hooks into the 
shmem mmap/fault code. For radeon this interface would suck: an 
object can be VRAM, main RAM, GTT, tiled, endian swapped, etc. But if I 
don't care about that, and Intel were to use mmap2, then in theory you could 
use an even higher bit than bit 31.

Dave



Re: Aperture mapping under GEM

2008-08-03 Thread Keith Packard
On Mon, 2008-08-04 at 05:13 +0100, Dave Airlie wrote:

 Well it can now, we just need to fix the X server :)

Yeah, I just discovered that today. Weird that the kernel was fixed
between the last time I looked and now though; NR_OPEN had been 1024 for
many years prior.

However, it's not just fixing the X server -- we'd have to fix every GL
application as well to not assume their fds were always in a narrow
range. Anyone care to wager how many 3D apps still use select?

We could get higher fds just by using dup2 and managing fds up in user
space. Making sure we didn't step on valid fds would be a pain. 
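
One way to sidestep the "step on valid fds" problem is to let the kernel pick the descriptor: `fcntl(fd, F_DUPFD, min)` returns the lowest free descriptor at or above `min`. The sketch below uses that instead of dup2; `PRIVATE_FD_FLOOR` and both helper names are hypothetical, and this is an illustration rather than anything the DRM libraries are known to do.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical floor above which the library keeps its private fds. */
#define PRIVATE_FD_FLOOR 64

/* Duplicate `fd` onto the lowest free descriptor >= PRIVATE_FD_FLOOR and
 * close the original. Returns the new fd, or -1 with errno set. */
static inline int move_fd_high(int fd)
{
    int high = fcntl(fd, F_DUPFD, PRIVATE_FD_FLOOR);

    if (high < 0)
        return -1;
    close(fd);
    return high;
}

/* select() can only watch descriptors below FD_SETSIZE. */
static inline bool fd_usable_with_select(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}
```

Keeping library descriptors high leaves the low numbers for the application, so code that still calls select() on its own fds keeps working as long as those stay below FD_SETSIZE.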

Plus, we're still stuck with increasing the max fd for each DRI
application. I'm sure a patch which had DRM increase this from inside
the kernel with no protections would be welcome by the kernel community.
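
For reference, a process can raise its own soft descriptor limit from user space, up to the hard limit, without any kernel change. A minimal sketch using the standard getrlimit/setrlimit calls; `raise_fd_soft_limit` is a hypothetical helper name.

```c
#include <sys/resource.h>

/* Raise the soft RLIMIT_NOFILE to the hard limit. Returns 0 on success,
 * -1 with errno set on failure. Going beyond the hard limit needs privilege. */
static inline int raise_fd_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == rl.rlim_max)
        return 0;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

This only helps up to whatever hard limit the administrator has set, so it does not remove the need to raise limits for applications that hold thousands of objects.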

 Well bit-31 is now a flag, just under an assumed name with a fake 
 passport. 

No argument; if there were a flag parameter to mmap, we'd just use it.
Given that we're using ioctls instead of raw syscalls, it seems like we
could just use a flag were it not for the lack of any additional
parameter to the underlying mmap fop.

Lacking this, we're stuck using a kludge (either fake linearized allocs
from the drm fd, or bit 31 on the gem object), or creating a separate
per-object fd (and underlying file/dentry/inode) for this other mapping.

Of these, the kludge plan seems more efficient, and I do prefer the
per-object kludge to the drm-fd kludge, but I'm not that tied to either;
the underlying code would all be the same, except for how to identify
which gem object the user was talking about.

 The question is whether this matters at all, or whether Intel 
 driver can just do it that way and have intel specific hooks into the 
 shmem mmap/fault code.

I don't think so; I can wrap the mmap fop easily enough and substitute
my own vma initialization. To invalidate the mapping after pulling the
object from the GTT, it looks like zap_page_range will work, then my
fault handler would get called on access to bind back to the GTT and
re-validate the map. Or so it seems to me; I haven't tried it yet, and I
won't have time to do that for a couple of weeks.
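
The invalidate-then-fault cycle described above can be modelled in user space as a small state machine. This is only a simulation under stated assumptions: `struct sim_obj` and the `sim_*` functions are hypothetical, the `pte_valid` array stands in for real page table entries, and rebinding the whole object on the first fault follows the original proposal rather than any tested driver code.

```c
#include <stdbool.h>
#include <stddef.h>

#define OBJ_PAGES 4

struct sim_obj {
    bool bound;                 /* object currently occupies GTT space */
    bool pte_valid[OBJ_PAGES];  /* stands in for the CPU PTEs of the GTT mapping */
    unsigned faults;            /* number of times the fault path ran */
};

/* Eviction path: drop the binding and invalidate every PTE of the mapping
 * (the role zap_page_range is expected to play in the real driver). */
static inline void sim_unbind(struct sim_obj *o)
{
    o->bound = false;
    for (size_t i = 0; i < OBJ_PAGES; i++)
        o->pte_valid[i] = false;
}

/* Fault path: rebind the whole object and revalidate all of its PTEs. */
static inline void sim_fault(struct sim_obj *o)
{
    o->faults++;
    o->bound = true;
    for (size_t i = 0; i < OBJ_PAGES; i++)
        o->pte_valid[i] = true;
}

/* A CPU access to `page`: faults only if the PTE is not valid. */
static inline void sim_access(struct sim_obj *o, size_t page)
{
    if (!o->pte_valid[page])
        sim_fault(o);
}
```

Touching any page after an unbind costs exactly one fault, and later accesses to other pages of the same object proceed without faulting until the next eviction.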

  For radeon this interface would suck, an 
 object can be VRAM, main RAM, GTT, tiled, endian swapped, etc.

We pass tile information into the kernel for our objects now; we assume
that the GTT map user wants a linear view of the object suitable for
plain old fb drawing. The only semantic distinction between the regular
mmap and the GTT mmap is this linearization of tiled objects; the WC
mapping doesn't affect how things work, only how fast each read/write
operation is, and whether the kernel will be doing additional CPU cache
flushing.

  but if I 
 don't care about that, if Intel were to use mmap2 then in theory you could 
 use an even higher bit than bit 31.

Yeah, someday we'll need to deal with single objects larger than 2GB.

-- 
[EMAIL PROTECTED]




Re: Aperture mapping under GEM

2008-08-02 Thread Thomas Hellström
Zhao, Chunfeng wrote:
 Hi Keith,
 Do we have a time line to merge DRM modesetting_GEM branch to upstream
 main line branch?

 Thanks!

 Chunfeng

 -----Original Message-----
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Keith
 Packard
 Sent: Thursday, July 31, 2008 9:18 PM
 To: dri-devel
 Cc: [EMAIL PROTECTED]
 Subject: Aperture mapping under GEM

 Ok, we clearly need to deal with mapping subsets of the graphics
 aperture, both for discrete graphics cards and for 2D on tiled surfaces.
 Plus, there are reasons for using WC object mappings which is easily
 done through the aperture.

 I haven't spent a huge amount of time thinking about this, but I figured
 I'd prod people into discussion to try and sort things out.

 First off, here's what I think I want.

 We expose mmap ioctls on the gem objects, and I'd like to use the same
 basic mechanism; when (if?) gem objects become real files, we would
 want to continue using the same interface. I suggest creating two mmap
 windows for main memory objects:

 0x00000000-0x7fffffff: map the backing pages directly
 0x80000000-0xffffffff: map the object through the aperture

 I don't quite know what to do with discrete card memory; suggestions
 here are welcome from people who've thought about this more than I.

 Using these two per-object windows means there isn't any need to manage
 a synthetic linear address space for some global object (like the DRM
 fd).

 Next, we need to hook the mmap path in the driver so that our code can
 get a chance to play. I attached something that might work.

 Once we've got an mmap request, here's what I think we want to do:

  1. Detect an aperture mapping request (bit 31)
  2. Map the object to the aperture (speculating that the app will
 actually use it)
  3. Initialize the vma to point at the aperture physical address
 range

 If the object remains mapped to the GTT, there's nothing else to do
 until the unmap request comes along at which point we tear down the vma.

 If the object gets unmapped from the GTT, we need to go find every VMA
 mapping it and fix up their PTEs to be unreadable/writable. I'm hoping
 this won't kill performance, but I'm fairly sure this will require an
 IPI to get the TLBs flushed on every core. Right? At least there won't
 be a cache flush as well.

 Now, if the application touches any one of those pages, we should map
 the whole object back to the GTT and rewrite the PTEs again. We could do
 this a page at a time, but I don't see any real benefit as we have to
 allocate the aperture space anyways, and it shouldn't be that much more
 expensive to fix up a lot of PTEs than to fix up just one.

 I think that's the whole story here; am I missing any big pieces?

   
Keith,

The description would be a little easier to follow if you didn't use the 
term map both for mmap-ing and AGP binding.

Anyway, the above would probably work but for Intel UMA only,
as other driver writers would have to deal with switching caching policy 
and VRAM copies as well, and either not use shmem objects or 
short-circuit their mapping / fault methods.

The Linux mm people are very strongly against having a driver 
manipulating ptes directly. For this reason, one could use 
unmap_mapping_range() to invalidate all user ptes pointing to a 
particular range in the address space of an object, and that's why TTM 
needs to manage a fake linear address space for the drm fd.

/Thomas









Re: Aperture mapping under GEM

2008-08-02 Thread Keith Packard
On Sat, 2008-08-02 at 17:01 +0200, Thomas Hellström wrote:

 The description would be a little easier to follow if you didn't use the 
 term map both for
 mmap-ing and AGP binding.

Yeah, using unique terms for each map is a good idea.

 Anyway, the above would probably work but for Intel UMA only,
 as other driver writers would have to deal with switching caching policy 
 and VRAM copies as well, and either not use shmem objects or 
 short-circuit their mapping / fault methods.

This is for the Intel driver, which is UMA only.

 The Linux mm people are very strongly against having a driver 
 manipulating ptes directly.

I'm always interested in coming up with the cleanest and most natural
interface, independent of arbitrary objections.

  and that's why TTM 
 needs to manage a fake linear address space for the drm fd.

Managing a fake linear address space just to match some existing
arbitrary API requirements is insane. Creating the right interface for
my UMA environment is my goal. I'm not sure precisely what that API
should be, but at least this one is obviously wrong.

I want to handle thousands of discrete objects and be able to map them
independently into my process, and bind them independently to the GTT.
Only a few will ever be mapped to my process and while all of them will
be bound to the GTT at times, only a subset will fit at any particular
time.

-- 
[EMAIL PROTECTED]




RE: Aperture mapping under GEM

2008-08-01 Thread Zhao, Chunfeng
Hi Keith,
Do we have a time line to merge DRM modesetting_GEM branch to upstream
main line branch?

Thanks!

Chunfeng

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Keith
Packard
Sent: Thursday, July 31, 2008 9:18 PM
To: dri-devel
Cc: [EMAIL PROTECTED]
Subject: Aperture mapping under GEM

Ok, we clearly need to deal with mapping subsets of the graphics
aperture, both for discrete graphics cards and for 2D on tiled surfaces.
Plus, there are reasons for using WC object mappings which is easily
done through the aperture.

I haven't spent a huge amount of time thinking about this, but I figured
I'd prod people into discussion to try and sort things out.

First off, here's what I think I want.

We expose mmap ioctls on the gem objects, and I'd like to use the same
basic mechanism; when (if?) gem objects become real files, we would
want to continue using the same interface. I suggest creating two mmap
windows for main memory objects:

0x00000000-0x7fffffff: map the backing pages directly
0x80000000-0xffffffff: map the object through the aperture

I don't quite know what to do with discrete card memory; suggestions
here are welcome from people who've thought about this more than I.

Using these two per-object windows means there isn't any need to manage
a synthetic linear address space for some global object (like the DRM
fd).

Next, we need to hook the mmap path in the driver so that our code can
get a chance to play. I attached something that might work.

Once we've got an mmap request, here's what I think we want to do:

 1. Detect an aperture mapping request (bit 31)
 2. Map the object to the aperture (speculating that the app will
actually use it)
 3. Initialize the vma to point at the aperture physical address
range
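
The three steps above can be sketched as a user-space model. Everything here is hypothetical: `struct sim_gem_obj`, `struct sim_vma` and `sim_mmap` are illustrative names, aperture space comes from a trivial bump allocator, and a real driver would set up the vma through the kernel's device mapping APIs instead of storing an address in a struct.

```c
#include <stdbool.h>
#include <stdint.h>

#define APERTURE_BIT UINT64_C(0x80000000)   /* bit 31 of the mmap offset */

struct sim_gem_obj {
    bool     bound;        /* true once the object has aperture space */
    uint64_t gtt_offset;   /* offset of the object inside the aperture */
};

struct sim_vma {
    bool     through_aperture;
    uint64_t phys_base;    /* physical address the vma points at; 0 if unused */
};

/* Steps 1-3 for one mmap request. `aperture_phys` is the physical base of
 * the aperture and `next_free` a trivial bump allocator for aperture space. */
static inline void sim_mmap(struct sim_gem_obj *obj, uint64_t mmap_off,
                            uint64_t obj_size, uint64_t aperture_phys,
                            uint64_t *next_free, struct sim_vma *vma)
{
    vma->through_aperture = (mmap_off & APERTURE_BIT) != 0;   /* step 1 */
    vma->phys_base = 0;
    if (!vma->through_aperture)
        return;             /* direct window: backing pages are mapped instead */

    if (!obj->bound) {                                        /* step 2 */
        obj->gtt_offset = *next_free;
        *next_free += obj_size;
        obj->bound = true;
    }
    vma->phys_base = aperture_phys + obj->gtt_offset          /* step 3 */
                   + (mmap_off & ~APERTURE_BIT);
}
```

Binding in step 2 happens at mmap time, before the application has touched anything, which is the speculation the list mentions.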

If the object remains mapped to the GTT, there's nothing else to do
until the unmap request comes along at which point we tear down the vma.

If the object gets unmapped from the GTT, we need to go find every VMA
mapping it and fix up their PTEs to be unreadable/writable. I'm hoping
this won't kill performance, but I'm fairly sure this will require an
IPI to get the TLBs flushed on every core. Right? At least there won't
be a cache flush as well.

Now, if the application touches any one of those pages, we should map
the whole object back to the GTT and rewrite the PTEs again. We could do
this a page at a time, but I don't see any real benefit as we have to
allocate the aperture space anyways, and it shouldn't be that much more
expensive to fix up a lot of PTEs than to fix up just one.

I think that's the whole story here; am I missing any big pieces?

-- 
[EMAIL PROTECTED]



Re: Aperture mapping under GEM

2008-08-01 Thread Keith Packard
On Fri, 2008-08-01 at 18:48 +0200, Jakob Bornecrantz wrote:

 The basic fault here is that you have added a driver specific flag to a 
 generic
 ioctl/syscall. Which the last time I checked we didn't want. For example on
 PCIE Radeon there is no GTT to map, so bit 31 makes no sense there.

The GEM MMAP ioctl is driver-specific, not generic for precisely this
reason.

-- 
[EMAIL PROTECTED]




RE: Aperture mapping under GEM

2008-08-01 Thread Keith Packard
On Fri, 2008-08-01 at 10:45 -0700, Zhao, Chunfeng wrote:
 Hi Keith,
 Do we have a time line to merge DRM modesetting_GEM branch to upstream
 main line branch?

Eric has posted the GEM patches to lkml for review; there are external
kernel changes which are necessary for GEM to work; I think that blocks
having GEM appear in the DRM master branch.

Jesse is working on rebasing KMS to GEM, but he's not yet comfortable
moving that to master.

In any case, if you look at Jesse's proposed 2.5 release plans (visible
through http://planet.freedesktop.org), you'll see that we expect all of
this to be available for our Q3 release which will occur at the end of
September. For that to happen, everything will be merged to the suitable
master upstream branches.

-- 
[EMAIL PROTECTED]




Re: Aperture mapping under GEM

2008-08-01 Thread Jakob Bornecrantz
On Fri, Aug 1, 2008 at 8:13 PM, Keith Packard [EMAIL PROTECTED] wrote:
 On Fri, 2008-08-01 at 18:48 +0200, Jakob Bornecrantz wrote:

 The basic fault here is that you have added a driver specific flag to a 
 generic
 ioctl/syscall. Which the last time I checked we didn't want. For example on
 PCIE Radeon there is no GTT to map, so bit 31 makes no sense there.

 The GEM MMAP ioctl is driver-specific, not generic for precisely this
 reason.

If you want a non-generic ioctl for that function go ahead, but IMHO
it should then be some sort of flag field on the request. Fiddling
with bits on the address feels a bit icky at best.

But, the last time I checked the only reason you could even hope to get
a mmap ioctl into mainline was under the provision that you later
moved it to the mmap syscall, which is however generic.

Cheers Jakob.



Re: Aperture mapping under GEM

2008-08-01 Thread Kristian Høgsberg
On Fri, Aug 1, 2008 at 2:13 PM, Keith Packard [EMAIL PROTECTED] wrote:
 On Fri, 2008-08-01 at 18:48 +0200, Jakob Bornecrantz wrote:

 The basic fault here is that you have added a driver specific flag to a 
 generic
 ioctl/syscall. Which the last time I checked we didn't want. For example on
 PCIE Radeon there is no GTT to map, so bit 31 makes no sense there.

 The GEM MMAP ioctl is driver-specific, not generic for precisely this
 reason.

I think Jakob has a point though.  From your first post in this thread:

We expose mmap ioctls on the gem objects, and I'd like to use the same
basic mechanism; when (if?) gem objects become real files, we would
want to continue using the same interface. I suggest creating two mmap
windows for main memory objects:

Are you saying that you're not planning to make the mmap ioctl a real
mmap syscall when/if that's feasible or that it's okay to add
intel-gem specific bits to the mmap arguments?  I recall Thomas asking
for a flags argument to the GEM create ioctl...

cheers,
Kristian



Re: Aperture mapping under GEM

2008-08-01 Thread Keith Packard
On Fri, 2008-08-01 at 20:33 +0200, Jakob Bornecrantz wrote:

 If you want a non-generic ioctl for that function go ahead, but IMHO
 it should then be some sort of flag field on the request. Fiddling
 with bits on the address feels a bit icky at best.

Yeah, it is a bit icky. The thing is that with a file object, you've got
one linear address space, so you can't really mmap the same address
space in two different ways. One alternative here is to create another
file object for the same pages and use different mmap semantics there,
but I'd prefer to avoid that as it will be fairly expensive in kernel
memory.

-- 
[EMAIL PROTECTED]




Re: Aperture mapping under GEM

2008-08-01 Thread Keith Packard
On Fri, 2008-08-01 at 14:34 -0400, Kristian Høgsberg wrote:

 Are you saying that you're not planning to make the mmap ioctl a real
 mmap syscall when/if that's feasible or that it's okay to add
 intel-gem specific bits to the mmap arguments?  I recall Thomas asking
 for a flags argument to the GEM create ioctl...

Note that there aren't intel specific bits here, the intel back-end just
has two separate address space ranges which expose different mappings.
So, it could still be managed through the regular mmap API. However, it
does seem a bit kludgy, and it might be better to have separate flags. 

However, it also seems odd to create two different mappings to the same
address, that have different cache behaviour -- technically, this isn't
valid for Intel PTEs, but as one mapping goes through the GTT, the
underlying physical address seen by the CPU differs.

Also, I don't know enough about the Linux mmap implementation to say
whether it will do 'odd' things with vmas which map the same address
range in a file. Using separate address ranges means I can see the
difference down in my mmap driver entry point, which seems like a
feature. Alternate suggestions are welcome, especially if they point to
a potential underlying implementation.

-- 
[EMAIL PROTECTED]




Aperture mapping under GEM

2008-07-31 Thread Keith Packard
Ok, we clearly need to deal with mapping subsets of the graphics
aperture, both for discrete graphics cards and for 2D on tiled surfaces.
Plus, there are reasons for using WC (write-combining) object mappings,
which are easily done through the aperture.

I haven't spent a huge amount of time thinking about this, but I figured
I'd prod people into discussion to try and sort things out.

First off, here's what I think I want.

We expose mmap ioctls on the gem objects, and I'd like to use the same
basic mechanism; when (if?) gem objects become real files, we would
want to continue using the same interface. I suggest creating two mmap
windows for main memory objects:

0x00000000-0x7fffffff: map the backing pages directly
0x80000000-0xffffffff: map the object through the aperture
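
As a rough illustration of the two-window split (a userspace sketch, not
driver code), the window can be picked by testing bit 31 of the per-object
mmap offset. All names here (GEM_MAP_GTT_BASE, gem_offset_window and so on)
are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: bit 31 of the offset selects the aperture window. */
#define GEM_MAP_GTT_BASE ((uint32_t)1 << 31)

enum gem_window {
	GEM_WINDOW_BACKING,	/* lower half: backing pages mapped directly */
	GEM_WINDOW_APERTURE	/* upper half: object mapped through the aperture */
};

/* Decide which of the two per-object windows an mmap offset falls in. */
static enum gem_window gem_offset_window(uint32_t offset)
{
	return (offset & GEM_MAP_GTT_BASE) ? GEM_WINDOW_APERTURE
					   : GEM_WINDOW_BACKING;
}

/* Strip the window bit to recover the byte offset inside the object. */
static uint32_t gem_offset_in_object(uint32_t offset)
{
	return offset & ~GEM_MAP_GTT_BASE;
}

/* Build the offset a caller would pass to request an aperture mapping. */
static uint32_t gem_aperture_offset(uint32_t object_offset)
{
	/* An object offset must fit in the lower 2 GiB window. */
	assert(object_offset < GEM_MAP_GTT_BASE);
	return object_offset | GEM_MAP_GTT_BASE;
}
```

With this layout an object is limited to 2 GiB per window, and the same
object offset names the same byte in both windows.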

I don't quite know what to do with discrete card memory; suggestions
here are welcome from people who've thought about this more than I.

Using these two per-object windows means there isn't any need to manage
a synthetic linear address space for some global object (like the DRM
fd).

Next, we need to hook the mmap path in the driver so that our code can
get a chance to play. I attached something that might work.
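
The hooking idea can be sketched in plain userspace C: copy the object's
operations table once, remember the original mmap entry, and install a
wrapper that looks at bit 31 before falling through. This is only a model
under stated assumptions; the model_* types and functions are hypothetical
stand-ins and not the kernel's struct file or file_operations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-ins for the kernel types. */
struct model_file;

struct model_file_ops {
	int (*mmap)(struct model_file *file, unsigned long offset);
};

struct model_file {
	const struct model_file_ops *ops;
};

#define MODEL_MAP_GTT_BASE (1UL << 31)

/* Original mmap entry, captured the first time a file is hooked. */
static int (*saved_mmap)(struct model_file *file, unsigned long offset);

/* Private copy of the ops table with only .mmap replaced. */
static struct model_file_ops hooked_ops;

/* Stub standing in for the backing store's own mmap implementation. */
static int model_backing_mmap(struct model_file *file, unsigned long offset)
{
	(void)file;
	(void)offset;
	return 0;
}

/* Wrapper: aperture requests are rejected for now, others pass through. */
static int hooked_mmap(struct model_file *file, unsigned long offset)
{
	if (offset & MODEL_MAP_GTT_BASE)
		return -ENODEV;	/* aperture path not implemented in this sketch */
	return saved_mmap(file, offset);
}

/* Point a file at the hooked table, building the table on first use. */
static void model_hook_mmap(struct model_file *file)
{
	assert(file->ops != NULL && file->ops->mmap != NULL);
	if (hooked_ops.mmap == NULL) {
		saved_mmap = file->ops->mmap;
		hooked_ops = *file->ops;
		hooked_ops.mmap = hooked_mmap;
	}
	file->ops = &hooked_ops;
}
```

Copying the whole table rather than wrapping each entry keeps every other
operation behaving exactly as before; only mmap is intercepted.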

Once we've got an mmap request, here's what I think we want to do:

 1. Detect an aperture mapping request (bit 31)
 2. Map the object to the aperture (speculating that the app will
    actually use it)
 3. Initialize the vma to point at the aperture physical address
    range

If the object remains mapped to the GTT, there's nothing else to do
until the unmap request comes along at which point we tear down the vma.

If the object gets unmapped from the GTT, we need to go find every VMA
mapping it and fix up their PTEs to be unreadable/unwritable. I'm hoping
this won't kill performance, but I'm fairly sure this will require an
IPI to get the TLBs flushed on every core. Right? At least there won't
be a cache flush as well.

Now, if the application touches any one of those pages, we should map
the whole object back to the GTT and rewrite the PTEs again. We could do
this a page at a time, but I don't see any real benefit as we have to
allocate the aperture space anyways, and it shouldn't be that much more
expensive to fix up a lot of PTEs than to fix up just one.
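
The bind, unbind and fault sequence described above can be modelled as a
small state machine. The following is a userspace sketch only, assuming one
flag per mapping in place of real PTEs and a counter in place of the
cross-CPU TLB flush; every model_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VMAS 4	/* arbitrary limit for the sketch */

/* Hypothetical stand-in for one user mapping of an object. */
struct model_vma {
	bool in_use;
	bool ptes_valid;	/* false: any access would fault */
};

/* Hypothetical stand-in for a GEM object and its aperture mappings. */
struct model_obj {
	bool bound_to_gtt;
	struct model_vma vmas[MAX_VMAS];
	unsigned int tlb_flushes;	/* counts simulated cross-CPU flushes */
};

/* Bind the object and make every existing mapping of it usable again. */
static void model_bind(struct model_obj *obj)
{
	obj->bound_to_gtt = true;
	for (size_t i = 0; i < MAX_VMAS; i++)
		if (obj->vmas[i].in_use)
			obj->vmas[i].ptes_valid = true;
}

/* mmap of the aperture window: bind speculatively and set up a mapping. */
static struct model_vma *model_mmap_gtt(struct model_obj *obj)
{
	for (size_t i = 0; i < MAX_VMAS; i++) {
		if (!obj->vmas[i].in_use) {
			obj->vmas[i].in_use = true;
			model_bind(obj);
			return &obj->vmas[i];
		}
	}
	return NULL;	/* no free slot in this toy model */
}

/* Eviction from the GTT: invalidate all mappings, then flush once. */
static void model_unbind(struct model_obj *obj)
{
	if (!obj->bound_to_gtt)
		return;
	obj->bound_to_gtt = false;
	for (size_t i = 0; i < MAX_VMAS; i++)
		if (obj->vmas[i].in_use)
			obj->vmas[i].ptes_valid = false;
	obj->tlb_flushes++;
}

/* CPU access through a mapping; returns true if it had to fault. */
static bool model_touch(struct model_obj *obj, struct model_vma *vma)
{
	assert(vma->in_use);
	if (vma->ptes_valid)
		return false;
	model_bind(obj);	/* rebind the whole object, not one page */
	return true;
}
```

In this model a single fault on any mapping restores all of them, which
matches the whole-object choice argued for above; a page-at-a-time variant
would need per-page state instead of one flag per mapping.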

I think that's the whole story here; am I missing any big pieces?

-- 
[EMAIL PROTECTED]
commit 0eb8c53640406c08b5a304d09bf08079b53eef84
Author: Keith Packard [EMAIL PROTECTED]
Date:   Tue Jul 29 20:19:28 2008 -0700

Start adding gtt mapping ioctls

diff --git a/linux-core/i915_gem.c b/linux-core/i915_gem.c
index 236203a..f187361 100644
--- a/linux-core/i915_gem.c
+++ b/linux-core/i915_gem.c
@@ -85,6 +85,23 @@ i915_gem_init_ioctl(struct drm_device *dev, void *data,
 }
 
 
+static struct file_operations i915_gem_file_operations;
+
+#define I915_GEM_MAP_GTT_BASE	(1 << 31)
+
+static int i915_gem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct drm_device *dev = file->private_data;
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	
+	DRM_INFO("mmap %08lx\n", vma->vm_start);
+	if (vma->vm_start & I915_GEM_MAP_GTT_BASE)
+		return -ENODEV;
+	else
+		return dev_priv->shmem_mmap (file, vma);
+}
+
+
 /**
  * Creates a new mm object and returns a handle to it.
  */
@@ -103,6 +120,16 @@ i915_gem_create_ioctl(struct drm_device *dev, void *data,
 	if (obj == NULL)
 		return -ENOMEM;
 
+	obj->filp->private_data = dev;
+	spin_lock(&dev->object_name_lock);
+	if (i915_gem_file_operations.mmap == NULL) {
+		dev->shmem_mmap = obj->filp->f_path.dentry->d_inode->i_fop->mmap;
+		i915_gem_file_operations = *obj->filp->f_path.dentry->d_inode->i_fop;
+		i915_gem_file_operations.mmap = i915_gem_mmap;
+	}
+	obj->filp->f_path.dentry->d_inode->i_fop = &i915_gem_file_operations;
+	spin_unlock(&dev->object_name_lock);
+
 	ret = drm_gem_handle_create(file_priv, obj, handle);
 	mutex_lock(&dev->struct_mutex);
 	drm_gem_object_handle_unreference(obj);
diff --git a/shared-core/i915_drv.h b/shared-core/i915_drv.h
index a9a431c..a577292 100644
--- a/shared-core/i915_drv.h
+++ b/shared-core/i915_drv.h
@@ -321,6 +321,9 @@ typedef struct drm_i915_private {
 		uint32_t bit_6_swizzle_x;
 		/** Bit 6 swizzling required for Y tiling */
 		uint32_t bit_6_swizzle_y;
+
+		/** shmem_mmap isn't public, but we discover it by magic */
+		int (*shmem_mmap) (struct file *file, struct vm_area_struct *vma);
 	} mm;
 } drm_i915_private_t;
 

