Re: [git pull] drm request 3

2010-03-05 Thread David Miller
From: Alan Cox a...@lxorguk.ukuu.org.uk
Date: Fri, 5 Mar 2010 12:38:34 +

 The conclusion is crystal clear, breaking an ABI via a flag day 
 cleanup/feature/etc is:
 
 Ingo go read the staging Kconfig. It's crystal clear, and lots of vendor
 junk that is in there being cleaned up it would be *insane* to keep their
 old APIs
 
 See there's a bigger offence than breaking an ABI - its called not RTFM.

All of this RTFM and what directory the noveau driver is sitting in is
entirely irrelevant Alan.

If it effects such a large number of people, which this noveau thing
does, it's entirely relevant to everyone.  And the way it's breaking
and making kernel development difficult for so many people matters to
us.

It's about the tester base, and this breakage shrinks the tester base
considerably.

Or do you want the kernel tested by less people?

--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [git pull] drm request 3

2010-03-05 Thread David Miller
From: Alan Cox a...@lxorguk.ukuu.org.uk
Date: Fri, 5 Mar 2010 15:09:34 +

 I think you miss a bigger picture ?
 
 If Fedora hadn't merged it then it wouldn't have gotten to the state of
 usability it had. If Fedora hadn't merged it then several hundred
 thousand users wouldn't have had useful working machines.

I think Fedora were right to merge it, and I think it was proper to
merge it into the upstream kernel.

But I also think the large size of that userbase should have been
respected by doing the right thing here, and that means not making
it so hard for Fedora users to use upstream kernels out of the box.

Making the sandbox claim cuts both ways Alan.

--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [git pull] drm request 3

2010-03-05 Thread David Miller
From: Daniel Stone dan...@fooishbar.org
Date: Fri, 5 Mar 2010 17:17:54 +0200

 On Fri, Mar 05, 2010 at 06:37:18AM -0800, David Miller wrote:
 If it effects such a large number of people, which this noveau thing
 does, it's entirely relevant to everyone.  And the way it's breaking
 and making kernel development difficult for so many people matters to
 us.
 
 Maybe the lesson to be learned from all this is, 'if the developers
 don't want something merged because they're not ready and forsee huge
 problems in the future, actually listen to them instead of blindly
 ramming it in regardless'? But maybe that's just me.

That's doesn't work, and it never will.

First of all, if we didn't merge the driver Fedora users wouldn't be
able to test the upstream kernel at all.

And if you think things through, there is one and onle one set of
actions that would have made things work properly.

And that's merge the driver upstream and not break user visible APIs.

In fact, I argue that the moment nouveau went into Fedora and
was turned on by default, the interfaces needed to be frozen.

Consider if it didn't go upstream and I want to do upstream
kernel development, ok so I patch the noveau-of-the-moment into
my upstream tree.

Six months and 10 DRM library updates later I go back and try to boot
that kernel.  And it's not going to work.

So if the user visible APIs are changed in any set of situations
(upstream merged, not upstream merged, etc.) things can end up
breaking.

Ergo, you simply can't sanely do it at all.  You have to have
a compatability story when you change these things.

Personally I wouldn't have ever committed to that user visible APIs
can break cause it's in -stable.  Because that's complete garbage.


--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [git pull] drm request 3

2010-03-05 Thread David Miller
From: Daniel Stone dan...@fooishbar.org
Date: Fri, 5 Mar 2010 17:40:09 +0200

 On Fri, Mar 05, 2010 at 07:26:12AM -0800, David Miller wrote:
 In fact, I argue that the moment nouveau went into Fedora and
 was turned on by default, the interfaces needed to be frozen.
 
 That's a matter for the Fedora kernel team; for better or worse, they
 made the choices they did, which included going through all the relevant
 pain to support this.  They didn't consider it suitable for upstream
 because they didn't think everyone else should be forced to endure that
 pain.

By not merging it upstream the pain is larger not smaller.

It's enabled by default, so you therefore can't test upstream kernels
by default.

And as I showed already, even if you jump through the hoops to make it
work (building noveau from out of tree in the upstream kernel) you'll
end up getting screwed when the API changes anyways.

Using VESA or whatever else you've suggested is just not a reasonable
alternative.

You can't unleash something like this on a userbase of this magnitude
and then throw your hands up in the air and say I'm not willing to
support this in a reasonable way.

We're better than that.

--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: Making Xorg easier to test

2010-03-05 Thread David Miller
From: Daniel Stone dan...@fooishbar.org
Date: Fri, 5 Mar 2010 17:41:43 +0200

 I understand that you guys are upset about this, so maybe you'd like to
 donate, say, 10% of your developer base to help out? That'd be pretty
 ace.

You have to support less than %10 of the amount of hardware we have to
support.

--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [git pull] drm request 3

2010-03-05 Thread David Miller
From: Alan Cox a...@lxorguk.ukuu.org.uk
Date: Fri, 5 Mar 2010 16:02:17 +

 You can't unleash something like this on a userbase of this magnitude
 and then throw your hands up in the air and say I'm not willing to
 support this in a reasonable way.
 
 Not to belabour the obvious - they didn't. Linus ordered them to merge it.

And I'm arguing not merging it upstream would be like saying
the same thing.

 We're better than that.
 
 If you consider the problem is with the Fedora kernel team decision to
 merge it into Fedora in the first place

For the second time Alan, I don't.

I think Fedora should have merged it.

I think it should be upstream.

And I think the API bustage needs to be avoided.


--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [git pull] drm request 3

2010-03-05 Thread David Miller
From: Daniel Stone dan...@fooishbar.org
Date: Fri, 5 Mar 2010 18:04:34 +0200

 So you're saying that there's no way to develop any reasonable body of
 code for the Linux kernel without committing to keeping your ABI
 absolutely rock-solid stable for eternity, no exceptions, ever? Cool,
 that worked really well for Xlib.

read() still works the same way it did 30 years ago last time I
checked.

--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: Making Xorg easier to test

2010-03-05 Thread David Miller
From: Xavier Bestel xavier.bes...@free.fr
Date: Fri, 05 Mar 2010 18:50:24 +0100

 On Fri, 2010-03-05 at 07:49 -0800, David Miller wrote:
 From: Daniel Stone dan...@fooishbar.org
 Date: Fri, 5 Mar 2010 17:41:43 +0200
 
  I understand that you guys are upset about this, so maybe you'd like to
  donate, say, 10% of your developer base to help out? That'd be pretty
  ace.
 
 You have to support less than %10 of the amount of hardware we have to
 support.
 
 You can't compare a network card and a GPU. The latter is way more
 complex to code for.

I wasn't talking specifically about network cards.  But if you want
specific examples...

How about the x86 or parisc cpu, and all their derivatives, are those
complex enough for you? :-)

I've worked on OpenGL capable grapics card drivers of various
vintages, and I honestly don't think there is anything in there more
complex than what we have to deal with in the kernel.

--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: Making Xorg easier to test

2010-03-05 Thread David Miller
From: Jesse Barnes jbar...@virtuousgeek.org
Date: Fri, 5 Mar 2010 10:02:44 -0800

 So from that perspective, the graphics stack is the most complex one in
 Linux by a long shot.  It's even worse than if we had STREAMS
 networking with a ton of different modules up in userspace messing with
 protocol. :)

Maybe :-)

--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [git pull] drm

2009-12-11 Thread David Miller
From: Alan Cox a...@lxorguk.ukuu.org.uk
Date: Fri, 11 Dec 2009 09:18:43 +

 However the fundamental point stands. The only people who can sign it off
 are the people who wrote it. Those are the rules. Red Hat didn't write the
 code, Red Hat cannot sign it off however much you rant at them. You also
 previously said you don't want to merge stuff when the authors don't want
 it merged.

I agree with a lot of what you say.

However, one point remains is that we were told, by Dave Airlie, that
they didn't want this code merged because the one person being paid to
work on it would be overwhelmed if the code went upstream.

I distinctly remember this being mentioned at the kernel summit.

And you know what?  That kind of excuse pisses me off too :-)

--
Return on Information:
Google Enterprise Search pays you back
Get the facts.
http://p.sf.net/sfu/google-dev2dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-29 Thread David Miller
From: Heiko Carstens heiko.carst...@de.ibm.com
Date: Thu, 29 Oct 2009 09:34:16 +0100

 Subject: [PATCH] fs: add missing compat_ptr handling for FS_IOC_RESVSP ioctl
 
 From: Heiko Carstens heiko.carst...@de.ibm.com
 
 For FS_IOC_RESVSP and FS_IOC_RESVSP64 compat_sys_ioctl() uses its
 arg argument as a pointer to userspace. However it is missing a
 a call to compat_ptr() which will do a proper pointer conversion.
 
 This was introduced with 3e63cbb1 fs: Add new pre-allocation ioctls
 to vfs for compatibility with legacy xfs ioctls.
 
 Cc: Ankit Jain m...@ankitjain.org
 Cc: Christoph Hellwig h...@lst.de
 Cc: Al Viro v...@zeniv.linux.org.uk
 Cc: Arnd Bergmann arndbergm...@googlemail.com
 Reported-by: David Miller da...@davemloft.net
 Signed-off-by: Heiko Carstens heiko.carst...@de.ibm.com

Acked-by: David S. Miller da...@davemloft.net

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-28 Thread David Miller
From: Dave Airlie airl...@linux.ie
Date: Wed, 28 Oct 2009 05:42:27 + (GMT)

 I'll add this to my TODO for before the next merge window as its
 definitely more than I can manage now.

I'll do it.

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-28 Thread David Miller
From: David Miller da...@davemloft.net
Date: Tue, 27 Oct 2009 23:04:50 -0700 (PDT)

 From: Dave Airlie airl...@linux.ie
 Date: Wed, 28 Oct 2009 05:42:27 + (GMT)
 
 I'll add this to my TODO for before the next merge window as its
 definitely more than I can manage now.
 
 I'll do it.

Ok, everything was straightforward except for radeon_cs, which
has three levels of indirection for the tables of data and
relocation chunks userland can pass into the DRM :-/

There is no way to isolate the compat handling without parsing
and bringing the pointer array and the chunk headers into the
kernel twice.

So I guess I'm willing to capitulate with is_compat_task() checks in
radeon_cs_init(), but if you want to improve this interface I see two
ways:

1) Eliminate one level of indirection so there is just a straight
   scatter-gather list at one level.

2) Use a base pointer and a set of offsets to communicate at least
   one level of indirection.

Both schemes should satisfy the buffering flexibility these
interfaces seem to be geared towards giving to userland.

Anyways the following patch is tested and seems to work well.

Signed-off-by: David S. Miller da...@davemloft.net

diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index 5ab2cf9..ba44eba 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -24,6 +24,8 @@
  * Authors:
  *Jerome Glisse gli...@freedesktop.org
  */
+#include linux/compat.h
+
 #include drmP.h
 #include radeon_drm.h
 #include radeon_reg.h
@@ -107,7 +109,12 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void 
*data)
if (p-chunks_array == NULL) {
return -ENOMEM;
}
-   chunk_array_ptr = (uint64_t *)(unsigned long)(cs-chunks);
+#ifdef CONFIG_COMPAT
+   if (is_compat_task())
+   chunk_array_ptr = compat_ptr((compat_uptr_t) cs-chunks);
+   else
+#endif
+   chunk_array_ptr = (uint64_t *)(unsigned long)(cs-chunks);
if (DRM_COPY_FROM_USER(p-chunks_array, chunk_array_ptr,
   sizeof(uint64_t)*cs-num_chunks)) {
return -EFAULT;
@@ -120,9 +127,15 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void 
*data)
for (i = 0; i  p-nchunks; i++) {
struct drm_radeon_cs_chunk __user **chunk_ptr = NULL;
struct drm_radeon_cs_chunk user_chunk;
-   uint32_t __user *cdata;
 
-   chunk_ptr = (void __user*)(unsigned long)p-chunks_array[i];
+#ifdef CONFIG_COMPAT
+   if (is_compat_task())
+   chunk_ptr = compat_ptr((compat_uptr_t)
+  p-chunks_array[i]);
+   else
+#endif
+   chunk_ptr = (void __user*) (unsigned long)
+   p-chunks_array[i];
if (DRM_COPY_FROM_USER(user_chunk, chunk_ptr,
   sizeof(struct drm_radeon_cs_chunk))) {
return -EFAULT;
@@ -142,9 +155,16 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void 
*data)
}
 
p-chunks[i].length_dw = user_chunk.length_dw;
-   p-chunks[i].user_ptr = (void __user *)(unsigned 
long)user_chunk.chunk_data;
+#ifdef CONFIG_COMPAT
+   if (is_compat_task())
+   p-chunks[i].user_ptr =
+   compat_ptr((compat_uptr_t)
+  user_chunk.chunk_data);
+   else
+#endif
+   p-chunks[i].user_ptr = (void __user *) (unsigned long)
+   user_chunk.chunk_data;
 
-   cdata = (uint32_t *)(unsigned long)user_chunk.chunk_data;
if (p-chunks[i].chunk_id != RADEON_CHUNK_ID_IB) {
size = p-chunks[i].length_dw * sizeof(uint32_t);
p-chunks[i].kdata = kmalloc(size, GFP_KERNEL);
diff --git a/drivers/gpu/drm/radeon/radeon_ioc32.c 
b/drivers/gpu/drm/radeon/radeon_ioc32.c
index a1bf11d..3c4dfa2 100644
--- a/drivers/gpu/drm/radeon/radeon_ioc32.c
+++ b/drivers/gpu/drm/radeon/radeon_ioc32.c
@@ -156,7 +156,7 @@ static int compat_radeon_cp_stipple(struct file *file, 
unsigned int cmd,
 typedef struct drm_radeon_tex_image32 {
unsigned int x, y;  /* Blit coordinates */
unsigned int width, height;
-   u32 data;
+   compat_uptr_t data;
 } drm_radeon_tex_image32_t;
 
 typedef struct drm_radeon_texture32 {
@@ -165,7 +165,7 @@ typedef struct drm_radeon_texture32 {
int format;
int width;  /* Texture image coordinates */
int height;
-   u32 image;
+   compat_uptr_t image;
 } drm_radeon_texture32_t;
 
 static int compat_radeon_cp_texture(struct file *file, unsigned int cmd,
@@ -180,8 +180,7 @@ static int compat_radeon_cp_texture(struct file *file, 
unsigned int cmd,
return -EFAULT

Re: is avoiding compat ioctls possible?

2009-10-28 Thread David Miller
From: Andi Kleen a...@firstfloor.org
Date: Wed, 28 Oct 2009 08:59:08 +0100

  }
 -chunk_array_ptr = (uint64_t *)(unsigned long)(cs-chunks);
 +#ifdef CONFIG_COMPAT
 +if (is_compat_task())
 
 Are the COMPAT ifdefs really needed? The compiler should optimize that
 away anyways on non compat aware architectures, shouldn't it?

There are no non-compat is_compat_task() definitions, nor are there
non-compat build definitions of compat_uptr_t and the assosciated
interfaces.

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-28 Thread David Miller
From: Andi Kleen a...@firstfloor.org
Date: Wed, 28 Oct 2009 09:19:09 +0100

 On Wed, Oct 28, 2009 at 01:11:41AM -0700, David Miller wrote:
 From: Andi Kleen a...@firstfloor.org
 Date: Wed, 28 Oct 2009 08:59:08 +0100
 
}
  - chunk_array_ptr = (uint64_t *)(unsigned long)(cs-chunks);
  +#ifdef CONFIG_COMPAT
  + if (is_compat_task())
  
  Are the COMPAT ifdefs really needed? The compiler should optimize that
  away anyways on non compat aware architectures, shouldn't it?
 
 There are no non-compat is_compat_task() definitions, nor are there
 non-compat build definitions of compat_uptr_t and the assosciated
 interfaces.
 
 That seems wrong then, better fix that too? It would be certainly better
 than adding a lot of ifdefs.

That's usually done by seperating the compat code into a seperate
file and obj-$(CONFIG_COMPAT) += foo.o in the Makefile.

That's not really possible here.

Sure, longer term we can provide those kinds of definitions to avoid
the ifdefs, but that's not what my change is about. :-)

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-28 Thread David Miller
From: Arnd Bergmann arndbergm...@googlemail.com
Date: Wed, 28 Oct 2009 13:13:32 +0100

 IMHO a better way to handle the radeon specific ioctls would be to
 avoid the compat_alloc_user_space and just define the common
 function taking a kernel pointer, with two implementations of the
 copy operation:

Agreed, that would be a great follow-on patch to mine.

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-28 Thread David Miller
From: Arnd Bergmann arndbergm...@googlemail.com
Date: Wed, 28 Oct 2009 13:13:32 +0100

 The ioctl argument actually needs a compat_ptr() conversion as well.
 For the s390 case, we can't do that in common code, because some
 ioctl methods put a 32 bit integer into the argument. Not sure if we
 want to fix that everywhere, the problem is very common and the
 impact is minimal.

What does s390 do with the 'arg' argument to sys_ioctl()?

That assumption that you can cast this to a pointer is everywhere.

If someone wants to fix this up, feel free to do an audit and go
over that seperately from my work :-)

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-28 Thread David Miller
From: Arnd Bergmann arndbergm...@googlemail.com
Date: Wed, 28 Oct 2009 16:40:18 +0100

 I'm pretty sure it was ok when we started adding the compat_ioctl
 handlers years ago. I think most people just ignored these for
 the majority of drivers that can't possibly run on s390. Even
 on s390, gcc will always do the right thing if you call call ioctl
 with a pointer to a normal object in the .data section, heap or stack,
 but hand-written assembly or other compilers may not.

Arnd, even compat_sys_ioctl() itself has constructs like:

case FS_IOC_RESVSP:
case FS_IOC_RESVSP64:
error = ioctl_preallocate(filp, (void __user *)arg);
goto out_fput;

That's why I asked about the 'arg' argument to sys_ioctl
on s390 :-)

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-27 Thread David Miller
From: Dave Airlie airl...@gmail.com
Date: Wed, 28 Oct 2009 11:22:18 +1000

 Is there really no way to avoid compat ioctls? was I delusional in
 thinking there was?

If you use pointers in your interfaces in any way, no.

And for this drm_radeon_info thing the pointer is pointless,
you're just returning 32-bit values to the user, just use
a u32 inside of the drm_radeon_info structure for the kernel
to place the result in.

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-27 Thread David Miller
From: Andi Kleen a...@firstfloor.org
Date: Wed, 28 Oct 2009 03:53:17 +0100

 Dave Airlie airl...@gmail.com writes:
 
 Now I thought cool I don't need to worry about compat ioctl hackery I can
 run 32 on 64 bit apps fine and it'll all just work.

 Now Dave Miller points out that I'm obivously deluded and we really need
 to add compat ioctls so that the kernel can truncate correctly 32-bit address
 in case userspace shoves garbage into the top 32bits of the u64.
 
 When the user space sees a u64 field it should never shove garbage here.

The problem is that it can if it wants to.

On sparc64, in order to make debugging easier, we trap any time
the kernel does a userspace access to a compat task and any
of the upper 32-bits are non-zero.

That's more than a reasonable assumption to make because user
pointers are either:

1) Zero chopped when passed in via register arguments to a syscall.

2) Go through the compat_uptr() interface which chops the upper bits.

So whether userland should or should not do something, we still have
to guard against it.

 You just have to cast on 32bit for this, which is a bit ugly.

Right, via the compat_*() interfaces.

 However some architectures need special operations on compat pointers
 (s390 iirc), but if you don't support those it might be reasonable
 to not support that.

s390 has to sign extend all 32-bit compat process pointers when
processing them in the 64-bit s390 kernel.  I think one other
architecture has this kind of situation too.

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-27 Thread David Miller
From: Dave Airlie airl...@gmail.com
Date: Wed, 28 Oct 2009 13:05:08 +1000

 DrNick on irc suggested just doing:
 if (is_compat_task()) ptr = 0x;
 
 Is there a one liner I can just do in the actual ioctls instead of
 adding 20 compat
 ones?

Just do the right thing and pass all userland compat pointers
through the correct compat_*() macros.

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-27 Thread David Miller
From: Dave Airlie airl...@gmail.com
Date: Wed, 28 Oct 2009 13:28:10 +1000

 On Wed, Oct 28, 2009 at 1:19 PM, Andi Kleen a...@firstfloor.org wrote:
 On Wed, Oct 28, 2009 at 01:05:08PM +1000, Dave Airlie wrote:
 We've designed that into a/c also, we pad all 64-bit values to 64-bit
 alignment on all the
 ioctls we've added to the drm in the past couple of years. Just because of
 this particular insanity.

 That's actually not needed, just use compat_*64.

 Assume no mistakes are made, new ioctls designed from scratch

 That seems like a bad assumption. It sounds like you already
 made some.
 
 You mean its impossible to design a 32/64-bit safe ioctl no matter what?

If you use pointers at all, yes.  We've said this several times.

 and we should live with having compat ioctls? This isn't something that was 
 very
 clearly stated or documented, the advice we had previously was that
 compat ioctls
 were only required for old ioctls or ioctls with design problems. compat_*64
 didn't exist when this code we designed, and we worked around that, but it was
 in no way mistaken, manually aligning 64-bit values is a perfectly
 good solution,
 not sure why you imply it isn't.

Manually aligning the way you have solves only one side of the
compat problem on x86 with 64-bit types.  It may handle the layout
properly bit it doesn't change the alignment requirements of the
containing structure and that's a similarly important the issue.

If you want to solve both aspects, use the aligned_u64 type.

But once you introduce pointers, you must have compat layer code,
and this is a hard requirement.  And there is nothing baroque or
wrong about it.

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-27 Thread David Miller
From: Andi Kleen a...@firstfloor.org
Date: Wed, 28 Oct 2009 04:34:55 +0100

 On Wed, Oct 28, 2009 at 01:28:10PM +1000, Dave Airlie wrote:
 Well this was what I was trying to gather, so maybe I just need to write
 something up to state that compat_ioctl is always required for new ioctls
 that pass pointers or 64-bit values hiding pointers, so more people
 don't make this mistake going forward. I can say when we inquired about this
 2 or so years ago when designing kms I didn't get this answer, which is a 
 pity.
 
 Right now you could probably ignore it (if you document it), since
 there are no non s390 architectures with this problem, just
 prepare mentally that you might need to revisit this at some point.

You can't ignore it on sparc64, it already OOPS's, and I refuse to
live with that if (is_compat_task()) masking hack, no way.

We designed portable interfaces for doing this stuff, please use it.

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-27 Thread David Miller
From: Dave Airlie airl...@linux.ie
Date: Wed, 28 Oct 2009 03:43:07 + (GMT)

 we already opencoded this (probably before it was macroisied or we just 
 pasted it), so the radeon one is buggy, I should just go and compat_* all 
 of these then and we should be all happy?

It should be, it's only working because:

1) A malicious userland hasn't put garbage in the upper bits for
   you yet.

2) Nobody has tested s390 yet :-)

Let's just not design non-portability into the code when we
have no strong reason to, ok? :-)

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-27 Thread David Miller
From: Dave Airlie airl...@linux.ie
Date: Wed, 28 Oct 2009 03:54:34 + (GMT)

 
  we already opencoded this (probably before it was macroisied or we just 
  pasted it), so the radeon one is buggy, I should just go and compat_* all 
  of these then and we should be all happy?
 
 It should be, it's only working because:
 
 1) A malicious userland hasn't put garbage in the upper bits for
you yet.
 
 2) Nobody has tested s390 yet :-)
 
 
 So will an inline like this work?
 
 static inline void *__user convert_user_ptr(uint64_t ioctl_ptr)
 {

Please don't do this.

This is exactly what I feared people would do when is_compat_task()
was added.  is_compat_task() is for situations where there is
otherwise no other way to handle the compat situation properly.

It's not that much work for you to hook up the compat ioctls properly,
and if you are clever you can do it in such a way that you'll get
warnings if someone accidently adds a new ioctl but forgets the
compat bits :-)

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: is avoiding compat ioctls possible?

2009-10-27 Thread David Miller
From: Andi Kleen a...@firstfloor.org
Date: Wed, 28 Oct 2009 05:36:11 +0100

 On Tue, Oct 27, 2009 at 08:37:09PM -0700, David Miller wrote:
 On sparc64, in order to make debugging easier, we trap any time
 the kernel does a userspace access to a compat task and any
 of the upper 32-bits are non-zero.
 
 Interesting. That definitely means Dave needs a special path.

I wouldn't call it special, the rule is that any userland pointer
has to either go through a syscall argument register or one
of the compat_*() accessor routines :-)

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH] drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86.

2009-02-20 Thread David Miller
From: Arnd Bergmann a...@arndb.de
Date: Thu, 19 Feb 2009 15:19:01 +0100

 On Wednesday 18 February 2009, David Miller wrote:
  drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86.
  
  Only X86 32-bit uses a different alignment for unsigned long long
  than it's 64-bit counterpart.
  
  Therefore this compat translation is only correct, and only needed,
  when either CONFIG_X86 or CONFIG_IA64.
  
  Signed-off-by: David S. Miller da...@davemloft.net
 
 The patch is correct AFAICT, but I'd like to point out that the
 problem could have been avoided (besides using a non-padded layout)
 by using a compat_u64 member in the struct definition instead of
 the packed attribute:

Indeed, David A. showed me compat_u64 et al. and I'm fine with
it being fixed that way too.  Feel free to submit a patch :)

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH] drm: Preserve SHMLBA bits in hash key for _DRM_SHM mappings.

2009-02-20 Thread David Miller
From: Andrew Morton a...@linux-foundation.org
Date: Thu, 19 Feb 2009 15:27:26 -0800

 eg:
 
 arch/xtensa/include/asm/shmparam.h
 #define SHMLBA  ((PAGE_SIZE  DCACHE_WAY_SIZE)? PAGE_SIZE : DCACHE_WAY_SIZE)
 
 
 But including linux/shm.h here seems a bit silly.   We'll see..

If DRM even builds on XTENSA, let alone is usable there, I'll
buy you a lollipop.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


drm: radeon: Fix unaligned access in r300_scratch().

2009-02-18 Thread David Miller

drm: radeon: Fix unaligned access in r300_scratch().

In compat mode, the cmdbuf-buf 64-bit address cookie can
potentially be only 32-bit aligned.  Dereferencing this as
64-bit causes expensive unaligned traps on platforms like
sparc64.

Use get_unaligned() to fix.

Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/gpu/drm/radeon/r300_cmdbuf.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r300_cmdbuf.c 
b/drivers/gpu/drm/radeon/r300_cmdbuf.c
index cace396..3efa633 100644
--- a/drivers/gpu/drm/radeon/r300_cmdbuf.c
+++ b/drivers/gpu/drm/radeon/r300_cmdbuf.c
@@ -37,6 +37,8 @@
 #include radeon_drv.h
 #include r300_reg.h
 
+#include asm/unaligned.h
+
 #define R300_SIMULTANEOUS_CLIPRECTS4
 
 /* Values for R300_RE_CLIPRECT_CNTL depending on the number of cliprects
@@ -917,6 +919,7 @@ static int r300_scratch(drm_radeon_private_t *dev_priv,
 {
u32 *ref_age_base;
u32 i, buf_idx, h_pending;
+   u64 ptr_addr;
RING_LOCALS;
 
if (cmdbuf-bufsz 
@@ -930,7 +933,8 @@ static int r300_scratch(drm_radeon_private_t *dev_priv,
 
dev_priv-scratch_ages[header.scratch.reg]++;
 
-   ref_age_base =  (u32 *)(unsigned long)*((uint64_t *)cmdbuf-buf);
+   ptr_addr = get_unaligned((u64 *)cmdbuf-buf);
+   ref_age_base = (u32 *)(unsigned long)ptr_addr;
 
cmdbuf-buf += sizeof(u64);
cmdbuf-bufsz -= sizeof(u64);
-- 
1.6.1.2.350.g88cc


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH] drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86.

2009-02-18 Thread David Miller

Ben, I'm pretty sure you're hitting this too on powerpc.  Every time a
32-bit process tries to upload cliprects it's going to fail with
-EFAULT or similar.

Nothing in userspace checks the return value for errors, etc. :-/

The only reason I caught this is because I have a debugging check on
sparc64 that makes sure that faults on kernel accesses to 32-bit
userspace never have any of the high 32-bits set.



drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86.

Only X86 32-bit uses a different alignment for unsigned long long
than it's 64-bit counterpart.

Therefore this compat translation is only correct, and only needed,
when either CONFIG_X86 or CONFIG_IA64.

Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/gpu/drm/drm_ioc32.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/drm_ioc32.c b/drivers/gpu/drm/drm_ioc32.c
index 920b72f..282d9fd 100644
--- a/drivers/gpu/drm/drm_ioc32.c
+++ b/drivers/gpu/drm/drm_ioc32.c
@@ -954,6 +954,7 @@ static int compat_drm_sg_free(struct file *file, unsigned 
int cmd,
 DRM_IOCTL_SG_FREE, (unsigned long)request);
 }
 
+#if defined(CONFIG_X86) || defined(CONFIG_IA64)
 typedef struct drm_update_draw32 {
drm_drawable_t handle;
unsigned int type;
@@ -984,6 +985,7 @@ static int compat_drm_update_draw(struct file *file, 
unsigned int cmd,
DRM_IOCTL_UPDATE_DRAW, (unsigned long)request);
return err;
 }
+#endif
 
 struct drm_wait_vblank_request32 {
enum drm_vblank_seq_type type;
@@ -1066,7 +1068,9 @@ drm_ioctl_compat_t *drm_compat_ioctls[] = {
 #endif
[DRM_IOCTL_NR(DRM_IOCTL_SG_ALLOC32)] = compat_drm_sg_alloc,
[DRM_IOCTL_NR(DRM_IOCTL_SG_FREE32)] = compat_drm_sg_free,
+#if defined(CONFIG_X86) || defined(CONFIG_IA64)
[DRM_IOCTL_NR(DRM_IOCTL_UPDATE_DRAW32)] = compat_drm_update_draw,
+#endif
[DRM_IOCTL_NR(DRM_IOCTL_WAIT_VBLANK32)] = compat_drm_wait_vblank,
 };
 
-- 
1.6.1.2.350.g88cc


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH] drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86.

2009-02-18 Thread David Miller
From: Benjamin Herrenschmidt b...@kernel.crashing.org
Date: Thu, 19 Feb 2009 08:59:50 +1100

 Could that be related to the kernel spewing a bunch of
 
 [drm:drm_update_drawable_info] *ERROR* Failed to copy cliprects from userspace

Yep, that is exactly caused by this bug.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH] drm: Preserve SHMLBA bits in hash key for _DRM_SHM mappings.

2009-02-18 Thread David Miller


Ok, this is the last DRM bug I am aware of with Radeon on sparc64.
What kills me is that I fixed this bug 6 or 7 years ago :)



drm: Preserve SHMLBA bits in hash key for _DRM_SHM mappings.

Platforms such as sparc64 have D-cache aliasing issues.  We
cannot allow virtual mappings in different contexts to be such
that two cache lines can be loaded for the same backing data.
Updates to one cache line won't be seen by accesses to the other
cache line.

Code in sparc64 and other architectures solve this problem by
making sure that all userland mappings of MAP_SHARED objects have
the same virtual address base.  They implement this by keying
off of the page offset, and using that to choose a suitably
consistent virtual address for mmap() requests.

Making things even worse, getting this wrong on sparc64 can result
in hangs during DRM lock acquisition.  This is because, at least on
UltraSPARC-III, normal loads consult the D-cache but atomics such
as 'cas' (which is what cmpxchg() is implement using) only consult
the L2 cache.  So if a D-cache alias is inserted, the load can
see different data than the atomic, and we'll loop forever because
the atomic compare-and-exchange will never complete successfully.

So to make this all work properly, we need to make sure that the
hash address computed by drm_map_handle() preserves the SHMLBA
relevant bits, and that's what this patch does for _DRM_SHM mappings.

As a historical note, many years ago this bug didn't exist because we
used to just use the low 32-bits of the address as the hash and just
hope for the best.  This preserved the SHMLBA bits properly.  But when
the hashtab code was added to DRM, this was no longer the case.

Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/gpu/drm/drm_bufs.c |   35 +++
 1 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
index 069544c..445c762 100644
--- a/drivers/gpu/drm/drm_bufs.c
+++ b/drivers/gpu/drm/drm_bufs.c
@@ -34,6 +34,8 @@
  */
 
 #include linux/vmalloc.h
+#include linux/log2.h
+#include asm/shmparam.h
 #include drmP.h
 
 resource_size_t drm_get_resource_start(struct drm_device *dev, unsigned int 
resource)
@@ -83,9 +85,11 @@ static struct drm_map_list *drm_find_matching_map(struct 
drm_device *dev,
 }
 
 static int drm_map_handle(struct drm_device *dev, struct drm_hash_item *hash,
- unsigned long user_token, int hashed_handle)
+ unsigned long user_token, int hashed_handle, int shm)
 {
-   int use_hashed_handle;
+   int use_hashed_handle, shift;
+   unsigned long add;
+
 #if (BITS_PER_LONG == 64)
use_hashed_handle = ((user_token  0xUL) || 
hashed_handle);
 #elif (BITS_PER_LONG == 32)
@@ -101,9 +105,31 @@ static int drm_map_handle(struct drm_device *dev, struct 
drm_hash_item *hash,
if (ret != -EINVAL)
return ret;
}
+
+   shift = 0;
+   add = DRM_MAP_HASH_OFFSET  PAGE_SHIFT;
+   if (shm  (SHMLBA  PAGE_SIZE)) {
+   int bits = ilog2(SHMLBA  PAGE_SHIFT) + 1;
+
+   /* For shared memory, we have to preserve the SHMLBA
+* bits of the eventual vma-vm_pgoff value during
+* mmap().  Otherwise we run into cache aliasing problems
+* on some platforms.  On these platforms, the pgoff of
+* a mmap() request is used to pick a suitable virtual
+* address for the mmap() region such that it will not
+* cause cache aliasing problems.
+*
+* Therefore, make sure the SHMLBA relevant bits of the
+* hash value we use are equal to those in the original
+* kernel virtual address.
+*/
+   shift = bits;
+   add |= ((user_token  PAGE_SHIFT)  ((1UL  bits) - 1UL));
+   }
+
return drm_ht_just_insert_please(dev-map_hash, hash,
 user_token, 32 - PAGE_SHIFT - 3,
-0, DRM_MAP_HASH_OFFSET  PAGE_SHIFT);
+shift, add);
 }
 
 /**
@@ -323,7 +349,8 @@ static int drm_addmap_core(struct drm_device * dev, 
resource_size_t offset,
/* We do it here so that dev-struct_mutex protects the increment */
user_token = (map-type == _DRM_SHM) ? (unsigned long)map-handle :
map-offset;
-   ret = drm_map_handle(dev, list-hash, user_token, 0);
+   ret = drm_map_handle(dev, list-hash, user_token, 0,
+(map-type == _DRM_SHM));
if (ret) {
if (map-type == _DRM_REGISTERS)
iounmap(map-handle);
-- 
1.6.1.2.350.g88cc


--
Open Source Business Conference (OSBC), March 24-25, 2009, 

Re: [dri-devel] BROKEN drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init().

2009-02-15 Thread David Miller
From: Sedat Dilek sedat.di...@googlemail.com
Date: Sun, 15 Feb 2009 10:28:29 +0100

 unfortunately, your latest DRM patch [1] is broken while building
 against Linux-2.6.25-rc5.
 
 This is my patch-series:

Did you read my patch postings?  I said my work was relative
to dri-next, as sparc64 also needs the changes David has
commited there from Ben H.  It therefore uses types only
found in that branch.

In David's dri-next branch, the proper type is struct drm_local_map,
so if you're not patching against his dri-next branch you won't have
Ben H's drm_addmap() changes which create and use that new type.

You will notice that when David added my changes to his dri-fixes
branch, he corrected up that single line which you could have very
easily done as well.  Just change struct drm_local_map to just plain
struct drm_map.


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.

2009-02-14 Thread David Miller
From: Dave Airlie airl...@gmail.com
Date: Sat, 14 Feb 2009 17:42:02 +1000

 On Sat, Feb 14, 2009 at 4:09 PM, David Miller da...@davemloft.net wrote:
  1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead
of the constant 4096 to determine how many GART entries there
are.  The PCI GART entries map 4096 bytes, always.  So using
getpagesize() is wrong.  (see RADEONDRIGetPciAperTableSize)
 
I have this fixed in my local tree.
 
 Oops.
 
  2) It doesn't check the surface byte swapping settings, so it
could be saving in one byte order and restoing in another.
 
I guess we could force RADEON_SURFACE_CNTL to zero around
the two memcpy()'s done in radeon_driver.c
 
 Might be a good plan.

I have patches for both of these things written, will submit
to the xorg-driver-ati list.

I also have a cunning plan to work around the surface swapping
GART issue in the DRM, will try that out right now.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.

2009-02-14 Thread David Miller
From: Benjamin Herrenschmidt b...@kernel.crashing.org
Date: Sat, 14 Feb 2009 20:07:54 +1100

  I did some research, and it does appear that the GART does read the
  PTEs from the VRAM using the Host Data Path.  This means the surface
  control byte swapping settings are applied.
  
  So for depths of 16 and 24, the GART is reading garbage PTEs.  And
  that's why the CP hangs.
 
 That makes me wonder how the heck did it work for me ! Or maybe... I've
 been using an R5xx which happens to have a bit that I haven't seen on
 R3xx that allows ... to set whether the GART reads come from HDP or
 directly from MC. That might be what saved my ass here.

I wonder.  But I really doubt it.  The bit is off by default
and the radeon DRM code explicitly sets it to off.

 We can do that by registering a surface from the kernel to cover the
 GART I suppose, and clean things a bit so that when using the DRI, X
 doesn't touch the surface registers -at all- and leaves it to the
 kernel.

That actually sounds like a good idea.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH]: drm: radeon: Use surface for PCI GART table.

2009-02-14 Thread David Miller

This allocates a physical surface for the PCI GART table, this way no
matter what other surface configurations exist the GART table will
always be seen by the hardware properly.

We encode the file pointer of the virtual surface allocate using a
special cookie value, called PCIGART_FILE_PRIV.  On the last close, we
release that surface.

Just to be doubly safe, we run the pcigart table setup with the main
surface control register clear.

Based upon ideas from David Airlie and Ben Benjamin Herrenschmidt.

Signed-off-by: David S. Miller da...@davemloft.net

diff --git a/drivers/gpu/drm/radeon/radeon_cp.c 
b/drivers/gpu/drm/radeon/radeon_cp.c
index e235778..119e642 100644
--- a/drivers/gpu/drm/radeon/radeon_cp.c
+++ b/drivers/gpu/drm/radeon/radeon_cp.c
@@ -909,6 +909,46 @@ static void radeon_set_pcigart(drm_radeon_private_t * 
dev_priv, int on)
}
 }
 
+static int radeon_setup_pcigart_surface(drm_radeon_private_t *dev_priv)
+{
+   struct drm_ati_pcigart_info *gart_info = dev_priv-gart_info;
+   struct radeon_virt_surface *vp;
+   int i;
+
+   for (i = 0; i  RADEON_MAX_SURFACES * 2; i++) {
+   if (!dev_priv-virt_surfaces[i].file_priv ||
+   dev_priv-virt_surfaces[i].file_priv == PCIGART_FILE_PRIV)
+   break;
+   }
+   if (i = 2 * RADEON_MAX_SURFACES)
+   return -ENOMEM;
+   vp = dev_priv-virt_surfaces[i];
+
+   for (i = 0; i  RADEON_MAX_SURFACES; i++) {
+   struct radeon_surface *sp = dev_priv-surfaces[i];
+   if (sp-refcount)
+   continue;
+
+   vp-surface_index = i;
+   vp-lower = gart_info-bus_addr;
+   vp-upper = vp-lower + gart_info-table_size;
+   vp-flags = 0;
+   vp-file_priv = PCIGART_FILE_PRIV;
+
+   sp-refcount = 1;
+   sp-lower = vp-lower;
+   sp-upper = vp-upper;
+   sp-flags = 0;
+
+   RADEON_WRITE(RADEON_SURFACE0_INFO + 16 * i, sp-flags);
+   RADEON_WRITE(RADEON_SURFACE0_LOWER_BOUND + 16 * i, sp-lower);
+   RADEON_WRITE(RADEON_SURFACE0_UPPER_BOUND + 16 * i, sp-upper);
+   return 0;
+   }
+
+   return -ENOMEM;
+}
+
 static int radeon_do_init_cp(struct drm_device *dev, drm_radeon_init_t *init,
 struct drm_file *file_priv)
 {
@@ -1202,6 +1242,9 @@ static int radeon_do_init_cp(struct drm_device *dev, 
drm_radeon_init_t *init,
} else
 #endif
{
+   u32 sctrl;
+   int ret;
+
dev_priv-gart_info.table_mask = DMA_BIT_MASK(32);
/* if we have an offset set from userspace */
if (dev_priv-pcigart_offset_set) {
@@ -1243,12 +1286,25 @@ static int radeon_do_init_cp(struct drm_device *dev, 
drm_radeon_init_t *init,
}
}
 
-   if (!drm_ati_pcigart_init(dev, dev_priv-gart_info)) {
+   sctrl = RADEON_READ(RADEON_SURFACE_CNTL);
+   RADEON_WRITE(RADEON_SURFACE_CNTL, 0);
+   ret = drm_ati_pcigart_init(dev, dev_priv-gart_info);
+   RADEON_WRITE(RADEON_SURFACE_CNTL, sctrl);
+
+   if (!ret) {
DRM_ERROR(failed to init PCI GART!\n);
radeon_do_cleanup_cp(dev);
return -ENOMEM;
}
 
+   ret = radeon_setup_pcigart_surface(dev_priv);
+   if (ret) {
+   DRM_ERROR(failed to setup GART surface!\n);
+   drm_ati_pcigart_cleanup(dev, dev_priv-gart_info);
+   radeon_do_cleanup_cp(dev);
+   return ret;
+   }
+
/* Turn on PCI GART */
radeon_set_pcigart(dev_priv, 1);
}
diff --git a/drivers/gpu/drm/radeon/radeon_drv.h 
b/drivers/gpu/drm/radeon/radeon_drv.h
index 9b60a26..ecfd414 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.h
+++ b/drivers/gpu/drm/radeon/radeon_drv.h
@@ -217,6 +217,7 @@ struct radeon_virt_surface {
u32 upper;
u32 flags;
struct drm_file *file_priv;
+#define PCIGART_FILE_PRIV  ((void *) -1L)
 };
 
 #define RADEON_FLUSH_EMITED(1  0)
diff --git a/drivers/gpu/drm/radeon/radeon_state.c 
b/drivers/gpu/drm/radeon/radeon_state.c
index 03fea43..043293a 100644
--- a/drivers/gpu/drm/radeon/radeon_state.c
+++ b/drivers/gpu/drm/radeon/radeon_state.c
@@ -3155,6 +3155,7 @@ void radeon_driver_preclose(struct drm_device *dev, 
struct drm_file *file_priv)
 
 void radeon_driver_lastclose(struct drm_device *dev)
 {
+   radeon_surfaces_release(PCIGART_FILE_PRIV, dev-dev_private);
radeon_do_release(dev);
 }
 

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the 

Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.

2009-02-14 Thread David Miller
From: David Miller da...@davemloft.net
Date: Sat, 14 Feb 2009 01:11:45 -0800 (PST)

 From: Benjamin Herrenschmidt b...@kernel.crashing.org
 Date: Sat, 14 Feb 2009 20:07:54 +1100
 
  We can do that by registering a surface from the kernel to cover the
  GART I suppose, and clean things a bit so that when using the DRI, X
  doesn't touch the surface registers -at all- and leaves it to the
  kernel.
 
 That actually sounds like a good idea.

Ok I have it working, patch coming right up.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH]: drm: radeon: Use surface for PCI GART table.

2009-02-14 Thread David Miller
From: Michel Dänzer mic...@daenzer.net
Date: Sat, 14 Feb 2009 10:59:59 +0100

 On Sat, 2009-02-14 at 01:51 -0800, David Miller wrote:
  This allocates a physical surface for the PCI GART table, this way no
  matter what other surface configurations exist the GART table will
  always be seen by the hardware properly.
 
 BTW, I don't think the swapping settings affect GPU access to the table,
 only CPU access, but this is a good solution anyway.

They absolutely and positively do effect GPU access to the table.
I've proven it with many conclusive tests over the past 3 days.

 This is only really necessary on big endian, but other than that:
 
 Acked-By: Michel Dänzer daen...@vmware.com

Thanks.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.

2009-02-13 Thread David Miller
From: Benjamin Herrenschmidt b...@kernel.crashing.org
Date: Thu, 12 Feb 2009 21:35:59 +1100

 Oh BTW something else to be careful with, though I suppose it's working
 some what by accident right now... when the GART is in the frame buffer
 it gets applied the current fb swapper setting... ouch !
 
 So it might be a good idea, if we're going to use DRM_READ/WRITE32 which
 afaik are readl/writel (ie, swapping) to make sure we at least
 temporarily disable that swapper while whacking the GART.

So I've narrowed down the final problems I'm having.  It's the surface
swapping settings indeed.

Running Xorg at depth 8, the CP can execute indirect buffers just
fine, no code changes necessary.

However, the CP hangs on indirect execution for depth 16 and 24.  But
these depths work if I hard disable the surface byte swapping settings
in the radeon Xorg driver.

I did some research, and it does appear that the GART does read the
PTEs from the VRAM using the Host Data Path.  This means the surface
control byte swapping settings are applied.

So for depths of 16 and 24, the GART is reading garbage PTEs.  And
that's why the CP hangs.

I have no idea how to handle this properly.  Not only does the Xorg
ATI driver set the swapping settings at an arbitray point, it also
mucks with them dynamically f.e. in the render helper functions.

The only way this can work properly is to choose one surface swapping
setting, set the VRAM GART table so that the GART sees the PTEs in the
correct format, and retain those settings throughout.

It seems like, in at least R5xx, they tried to add a control bit in
the PCI-E GART registers to handle this.  Bit 6 in PCI-E indirect
register 0x10 is named 'GART_RDREQPATH_SEL' and has description:

Read request path
 0=HDP
 1=Direct Rd Req to MC

This seems to be intended to be a way to have the GART PTE reads
bypass the full Host Data Path (and thus potential surface byte
swaps), by setting this bit to 1.

I tried doing that, but it doesn't help my RV370 so perhaps older
chips don't support this bit.  It's hard to tell as this is the area
where all of AMD's published GPU documents are severely lacking.

I tested whether reading back the PCIE_TX_GART_CNTL register shows
bit 6 after I try to write it, and it always reads back zero.

There are even more complications, the VT enter/exit code in the Xorg
ATI driver tries to save and restore the VRAM GART table for PCI-E
cards.  But this:

1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead
   of the constant 4096 to determine how many GART entries there
   are.  The PCI GART entries map 4096 bytes, always.  So using
   getpagesize() is wrong.  (see RADEONDRIGetPciAperTableSize)

   I have this fixed in my local tree.

2) It doesn't check the surface byte swapping settings, so it
   could be saving in one byte order and restoing in another.

   I guess we could force RADEON_SURFACE_CNTL to zero around
   the two memcpy()'s done in radeon_driver.c

But really this whole area is a mess, and I know KMS is coming to fix
this, but even in the traditional XORG/DRM layout XORG has no business
keeping the GART table uptodate across VT switches.  It should be
calling into the DRM with an ioctl to write the GART table out to VRAM
again.

Finally there is a huge issue with how the Xorg ATI driver resets the
chip using the RBBM.  It soft resets the CP, but this resets the ring
read pointer.  However, nothing is done to make sure the DRM resync's
the ring write pointer (which remains unchanged by a soft CP reset) as
well as it's internal software ring state.

The result is that on the very next CP command submission, the CP
re-executes everything from ring entry zero until wherever the DRM
things the write pointer was at the time of the CP soft reset.

Any time the Xorg ATI driver does a CP soft reset, it should do
CP stop and resume calls to the DRM to resync the ring pointers.
And this does not appear to be happening where it needs to be
happening.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH]: drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init().

2009-02-13 Thread David Miller

drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init().

The variable 'max_pages' is ambiguous.  There are two concepts
of pages being used in this function.

First, we have ATI GART pages which are always 4096 bytes.
Then, we have system pages which are of size PAGE_SIZE.

Eliminate the confusion by creating max_ati_pages and
max_real_pages.  Calculate and use them as appropriate.

Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/gpu/drm/ati_pcigart.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c
index 7972ec8..4d86a62 100644
--- a/drivers/gpu/drm/ati_pcigart.c
+++ b/drivers/gpu/drm/ati_pcigart.c
@@ -102,7 +102,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct 
drm_ati_pcigart_info *ga
u32 *pci_gart, page_base, gart_idx;
dma_addr_t bus_address = 0;
int i, j, ret = 0;
-   int max_pages;
+   int max_ati_pages, max_real_pages;
 
if (!entry) {
DRM_ERROR(no scatter/gather memory!\n);
@@ -130,14 +130,15 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct 
drm_ati_pcigart_info *ga
 
pci_gart = (u32 *) address;
 
-   max_pages = (gart_info-table_size / sizeof(u32));
-   pages = (entry-pages = max_pages)
-   ? entry-pages : max_pages;
+   max_ati_pages = (gart_info-table_size / sizeof(u32));
+   max_real_pages = max_ati_pages / (PAGE_SIZE / ATI_PCIGART_PAGE_SIZE);
+   pages = (entry-pages = max_real_pages)
+   ? entry-pages : max_real_pages;
 
if (gart_info-gart_table_location == DRM_ATI_GART_MAIN) {
-   memset(pci_gart, 0, max_pages * sizeof(u32));
+   memset(pci_gart, 0, max_ati_pages * sizeof(u32));
} else {
-   for (gart_idx = 0; gart_idx  max_pages; gart_idx++)
+   for (gart_idx = 0; gart_idx  max_ati_pages; gart_idx++)
DRM_WRITE32(map, gart_idx * sizeof(u32), 0);
}
 
-- 
1.6.1.2.350.g88cc


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.

2009-02-12 Thread David Miller

The PCI GART table initialization code treats the GART table mapping
unconditionally as a kernel virtual address.

But it could be in the framebuffer, for example, and thus we're
dealing with a PCI MEM space ioremap() cookie.  Treating that as a
virtual address is illegal and will crash some system types (such as
sparc64 where the ioremap() return value is actually a physical I/O
address).

So access the area correctly, using gart_info-gart_table_location as
our guide.

Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/gpu/drm/ati_pcigart.c |   26 --
 1 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c
index c533d0c..2cd827a 100644
--- a/drivers/gpu/drm/ati_pcigart.c
+++ b/drivers/gpu/drm/ati_pcigart.c
@@ -95,10 +95,11 @@ EXPORT_SYMBOL(drm_ati_pcigart_cleanup);
 
 int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info 
*gart_info)
 {
+   struct drm_local_map *map = gart_info-mapping;
struct drm_sg_mem *entry = dev-sg;
void *address = NULL;
unsigned long pages;
-   u32 *pci_gart, page_base;
+   u32 *pci_gart, page_base, gart_idx;
dma_addr_t bus_address = 0;
int i, j, ret = 0;
int max_pages;
@@ -133,8 +134,14 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct 
drm_ati_pcigart_info *ga
pages = (entry-pages = max_pages)
? entry-pages : max_pages;
 
-   memset(pci_gart, 0, max_pages * sizeof(u32));
+   if (gart_info-gart_table_location == DRM_ATI_GART_MAIN) {
+   memset(pci_gart, 0, max_pages * sizeof(u32));
+   } else {
+   for (gart_idx = 0; gart_idx  max_pages; gart_idx++)
+   DRM_WRITE32(map, gart_idx * sizeof(u32), 0);
+   }
 
+   gart_idx = 0;
for (i = 0; i  pages; i++) {
/* we need to support large memory configurations */
entry-busaddr[i] = pci_map_page(dev-pdev, entry-pagelist[i],
@@ -149,19 +156,26 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct 
drm_ati_pcigart_info *ga
page_base = (u32) entry-busaddr[i];
 
for (j = 0; j  (PAGE_SIZE / ATI_PCIGART_PAGE_SIZE); j++) {
+   u32 val;
+
switch(gart_info-gart_reg_if) {
case DRM_ATI_GART_IGP:
-   *pci_gart = cpu_to_le32((page_base) | 0xc);
+   val = page_base | 0xc;
break;
case DRM_ATI_GART_PCIE:
-   *pci_gart = cpu_to_le32((page_base  8) | 0xc);
+   val = (page_base  8) | 0xc;
break;
default:
case DRM_ATI_GART_PCI:
-   *pci_gart = cpu_to_le32(page_base);
+   val = page_base;
break;
}
-   pci_gart++;
+   if (gart_info-gart_table_location ==
+   DRM_ATI_GART_MAIN)
+   pci_gart[gart_idx] = cpu_to_le32(val);
+   else
+   DRM_WRITE32(map, gart_idx * sizeof(u32), val);
+   gart_idx++;
page_base += ATI_PCIGART_PAGE_SIZE;
}
}
-- 
1.6.1.2.350.g88cc


--
Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM)
software. With Adobe AIR, Ajax developers can use existing skills and code to
build responsive, highly engaging applications that combine the power of local
resources and data with the reach of the web. Download the Adobe AIR SDK and
Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH 2/5]: drm: ati_pcigart: Need to use PCI_DMA_BIDIRECTIONAL.

2009-02-12 Thread David Miller

The buffers mapped by the PCI GART can be written to by the device,
not just read.

For example, this happens via the RB_RPTR writeback on Radeon.

So we can't use PCI_DMA_TODEVICE else we'll get protection faults
on IOMMU platforms.

Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/gpu/drm/ati_pcigart.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c
index 2cd827a..7972ec8 100644
--- a/drivers/gpu/drm/ati_pcigart.c
+++ b/drivers/gpu/drm/ati_pcigart.c
@@ -77,7 +77,7 @@ int drm_ati_pcigart_cleanup(struct drm_device *dev, struct 
drm_ati_pcigart_info
if (!entry-busaddr[i])
break;
pci_unmap_page(dev-pdev, entry-busaddr[i],
-PAGE_SIZE, PCI_DMA_TODEVICE);
+PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
}
 
if (gart_info-gart_table_location == DRM_ATI_GART_MAIN)
@@ -145,7 +145,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct 
drm_ati_pcigart_info *ga
for (i = 0; i  pages; i++) {
/* we need to support large memory configurations */
entry-busaddr[i] = pci_map_page(dev-pdev, entry-pagelist[i],
-0, PAGE_SIZE, 
PCI_DMA_TODEVICE);
+0, PAGE_SIZE, 
PCI_DMA_BIDIRECTIONAL);
if (entry-busaddr[i] == 0) {
DRM_ERROR(unable to map PCIGART pages!\n);
drm_ati_pcigart_cleanup(dev, gart_info);
-- 
1.6.1.2.350.g88cc


--
Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM)
software. With Adobe AIR, Ajax developers can use existing skills and code to
build responsive, highly engaging applications that combine the power of local
resources and data with the reach of the web. Download the Adobe AIR SDK and
Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH 3/5]: drm: radeon: Fix ring_rptr accesses.

2009-02-12 Thread David Miller

The memory behind ring_rptr can either be in ioremapped memory
or a vmalloc() normal kernel memory buffer.

However, the code unconditionally uses DRM_{READ,WRITE}32() (and thus
readl() and writel()) to access it.

Basically, if RADEON_IS_AGP then it's ioremap()'d memory else it's
vmalloc'd memory.

Adjust all of the ring_rptr access code as needed.

While we're here, kill the 'scratch' pointer in drm_radeon_private.
It's only used in the one place where it is initialized.

Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/gpu/drm/radeon/radeon_cp.c|   70 +++--
 drivers/gpu/drm/radeon/radeon_drv.h   |   17 
 drivers/gpu/drm/radeon/radeon_state.c |6 +-
 3 files changed, 70 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_cp.c 
b/drivers/gpu/drm/radeon/radeon_cp.c
index ae4ffa1..9053755 100644
--- a/drivers/gpu/drm/radeon/radeon_cp.c
+++ b/drivers/gpu/drm/radeon/radeon_cp.c
@@ -43,6 +43,52 @@
 static int radeon_do_cleanup_cp(struct drm_device * dev);
 static void radeon_do_cp_start(drm_radeon_private_t * dev_priv);
 
+static u32 radeon_read_ring_rptr(drm_radeon_private_t *dev_priv, u32 off)
+{
+   u32 val;
+
+   if (dev_priv-flags  RADEON_IS_AGP) {
+   val = DRM_READ32(dev_priv-ring_rptr, off);
+   } else {
+   val = *(((volatile u32 *)
+dev_priv-ring_rptr-handle) +
+   (off / sizeof(u32)));
+   val = le32_to_cpu(val);
+   }
+   return val;
+}
+
+u32 radeon_get_ring_head(drm_radeon_private_t *dev_priv)
+{
+   if (dev_priv-writeback_works)
+   return radeon_read_ring_rptr(dev_priv, 0);
+   else
+   return RADEON_READ(RADEON_CP_RB_RPTR);
+}
+
+static void radeon_write_ring_rptr(drm_radeon_private_t *dev_priv, u32 off, 
u32 val)
+{
+   if (dev_priv-flags  RADEON_IS_AGP)
+   DRM_WRITE32(dev_priv-ring_rptr, off, val);
+   else
+   *(((volatile u32 *) dev_priv-ring_rptr-handle) +
+ (off / sizeof(u32))) = cpu_to_le32(val);
+}
+
+void radeon_set_ring_head(drm_radeon_private_t *dev_priv, u32 val)
+{
+   radeon_write_ring_rptr(dev_priv, 0, val);
+}
+
+u32 radeon_get_scratch(drm_radeon_private_t *dev_priv, int index)
+{
+   if (dev_priv-writeback_works)
+   return radeon_read_ring_rptr(dev_priv,
+RADEON_SCRATCHOFF(index));
+   else
+   return RADEON_READ(RADEON_SCRATCH_REG0 + 4*index);
+}
+
 static u32 R500_READ_MCIND(drm_radeon_private_t *dev_priv, int addr)
 {
u32 ret;
@@ -647,10 +693,6 @@ static void radeon_cp_init_ring_buffer(struct drm_device * 
dev,
RADEON_WRITE(RADEON_SCRATCH_ADDR, RADEON_READ(RADEON_CP_RB_RPTR_ADDR)
 + RADEON_SCRATCH_REG_OFFSET);
 
-   dev_priv-scratch = ((__volatile__ u32 *)
-dev_priv-ring_rptr-handle +
-(RADEON_SCRATCH_REG_OFFSET / sizeof(u32)));
-
RADEON_WRITE(RADEON_SCRATCH_UMSK, 0x7);
 
/* Turn on bus mastering */
@@ -668,13 +710,13 @@ static void radeon_cp_init_ring_buffer(struct drm_device 
* dev,
RADEON_WRITE(RADEON_BUS_CNTL, tmp);
} /* PCIE cards appears to not need this */
 
-   dev_priv-scratch[0] = 0;
+   radeon_write_ring_rptr(dev_priv, RADEON_SCRATCHOFF(0), 0);
RADEON_WRITE(RADEON_LAST_FRAME_REG, 0);
 
-   dev_priv-scratch[1] = 0;
+   radeon_write_ring_rptr(dev_priv, RADEON_SCRATCHOFF(1), 0);
RADEON_WRITE(RADEON_LAST_DISPATCH_REG, 0);
 
-   dev_priv-scratch[2] = 0;
+   radeon_write_ring_rptr(dev_priv, RADEON_SCRATCHOFF(2), 0);
RADEON_WRITE(RADEON_LAST_CLEAR_REG, 0);
 
radeon_do_wait_for_idle(dev_priv);
@@ -698,12 +740,15 @@ static void radeon_test_writeback(drm_radeon_private_t * 
dev_priv)
/* Writeback doesn't seem to work everywhere, test it here and possibly
 * enable it if it appears to work
 */
-   DRM_WRITE32(dev_priv-ring_rptr, RADEON_SCRATCHOFF(1), 0);
+   radeon_write_ring_rptr(dev_priv, RADEON_SCRATCHOFF(1), 0);
+
RADEON_WRITE(RADEON_SCRATCH_REG1, 0xdeadbeef);
 
for (tmp = 0; tmp  dev_priv-usec_timeout; tmp++) {
-   if (DRM_READ32(dev_priv-ring_rptr, RADEON_SCRATCHOFF(1)) ==
-   0xdeadbeef)
+   u32 val;
+
+   val = radeon_read_ring_rptr(dev_priv, RADEON_SCRATCHOFF(1));
+   if (val == 0xdeadbeef)
break;
DRM_UDELAY(1);
}
@@ -1540,7 +1585,7 @@ struct drm_buf *radeon_freelist_get(struct drm_device * 
dev)
start = dev_priv-last_buf;
 
for (t = 0; t  dev_priv-usec_timeout; t++) {
-   u32 done_age = GET_SCRATCH(1);
+   u32 done_age = GET_SCRATCH(dev_priv, 1);
DRM_DEBUG(done_age = %d\n, done_age);
for (i = start; i  

[PATCH 5/5]: drm: radeon: Fix calculation of RB_RPTR_ADDR in non-AGP case.

2009-02-12 Thread David Miller

The address needs to be a GART relative address, rather than a PCI
DMA address.

Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/gpu/drm/radeon/radeon_cp.c |   15 ---
 1 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_cp.c 
b/drivers/gpu/drm/radeon/radeon_cp.c
index 9053755..e235778 100644
--- a/drivers/gpu/drm/radeon/radeon_cp.c
+++ b/drivers/gpu/drm/radeon/radeon_cp.c
@@ -655,17 +655,10 @@ static void radeon_cp_init_ring_buffer(struct drm_device 
* dev,
} else
 #endif
{
-   struct drm_sg_mem *entry = dev-sg;
-   unsigned long tmp_ofs, page_ofs;
-
-   tmp_ofs = dev_priv-ring_rptr-offset -
-   (unsigned long)dev-sg-virtual;
-   page_ofs = tmp_ofs  PAGE_SHIFT;
-
-   RADEON_WRITE(RADEON_CP_RB_RPTR_ADDR, entry-busaddr[page_ofs]);
-   DRM_DEBUG(ring rptr: offset=0x%08lx handle=0x%08lx\n,
- (unsigned long)entry-busaddr[page_ofs],
- entry-handle + tmp_ofs);
+   RADEON_WRITE(RADEON_CP_RB_RPTR_ADDR,
+dev_priv-ring_rptr-offset
+- ((unsigned long) dev-sg-virtual)
++ dev_priv-gart_vm_start);
}
 
/* Set ring buffer size */
-- 
1.6.1.2.350.g88cc


--
Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM)
software. With Adobe AIR, Ajax developers can use existing skills and code to
build responsive, highly engaging applications that combine the power of local
resources and data with the reach of the web. Download the Adobe AIR SDK and
Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH 4/5]: drm: radeon: Fix RADEON_*_EMITED defines.

2009-02-12 Thread David Miller

These are not supposed to be booleans, they are
supposed to be bit masks.

Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/gpu/drm/radeon/radeon_drv.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.h 
b/drivers/gpu/drm/radeon/radeon_drv.h
index a253cf0..9b60a26 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.h
+++ b/drivers/gpu/drm/radeon/radeon_drv.h
@@ -219,8 +219,8 @@ struct radeon_virt_surface {
struct drm_file *file_priv;
 };
 
-#define RADEON_FLUSH_EMITED(1  0)
-#define RADEON_PURGE_EMITED(1  1)
+#define RADEON_FLUSH_EMITED(1  0)
+#define RADEON_PURGE_EMITED(1  1)
 
 struct drm_radeon_master_private {
drm_local_map_t *sarea;
-- 
1.6.1.2.350.g88cc


--
Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM)
software. With Adobe AIR, Ajax developers can use existing skills and code to
build responsive, highly engaging applications that combine the power of local
resources and data with the reach of the web. Download the Adobe AIR SDK and
Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.

2009-02-12 Thread David Miller
From: Benjamin Herrenschmidt b...@kernel.crashing.org
Date: Thu, 12 Feb 2009 21:35:59 +1100

 Oh BTW something else to be careful with, though I suppose it's working
 some what by accident right now... when the GART is in the frame buffer
 it gets applied the current fb swapper setting... ouch !
 
 So it might be a good idea, if we're going to use DRM_READ/WRITE32 which
 afaik are readl/writel (ie, swapping) to make sure we at least
 temporarily disable that swapper while whacking the GART.

Cute.

I wonder if this is what is tripping me up.

But, looking more closely, it appears that:

1) The kernel radeon framebuffer driver doesn't mess with
   the framebuffer endianness setting.

2) On = R300 (which my chip is), Xorg leaves it alone too.

--
Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM)
software. With Adobe AIR, Ajax developers can use existing skills and code to
build responsive, highly engaging applications that combine the power of local
resources and data with the reach of the web. Download the Adobe AIR SDK and
Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.

2009-02-12 Thread David Miller
From: Dave Airlie airl...@gmail.com
Date: Thu, 12 Feb 2009 21:23:13 +1000

 are you on a PCI or PCIE card, I've no idea what buses you have on sparc64.
 
 On the PCI cards the GART table will always be in main memory.
 PCIE always in VRAM.

PCI-E

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH 0/5]: ATI/RADEON DRM bug fixes...

2009-02-12 Thread David Miller
From: Dave Airlie airl...@gmail.com
Date: Thu, 12 Feb 2009 21:26:51 +1000

 On Thu, Feb 12, 2009 at 8:15 PM, David Miller da...@davemloft.net wrote:
 
  David, this work is against your drm-next branch.
 
  Here are a collection of bug fixes for the Radeon DRM support.  Most
  of them have to do with trying to access kernel virtual addresses
  using DRM_READ32() and DRM_WRITE32().
 
  With these patches at least the writeback test works on sparc64 and
  the CP is able to process commands in the ring.  I'm now diagnosing
  some further problem that's preventing Xorg from functioning fully but
  I should be able to diagnose that soon.
 
  You'll probably love patch #4 in this series, and it probably explains
  all kinds of weird problems people run into with DRM on radeon cards.
 
 Thanks Dave,
 
 These all look great, the EMITED one has potential to regress
 something by fixing it,
 but I think it might actually make things better, I'll push all of
 these to Linus asap as
 they all fix real bugs. I've put them into my drm-fixes branch and I'll try 
 and
 re-review and get some testing tomorrow when I'm more awake.

Note that these depend upon Ben's drm_addmap and resource fixed in
your drm-next branch, sparc64 was also broken by those problems.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: drm + 4GB RAM + swiotlb = drm craps out

2007-04-01 Thread David Miller
From: Dave Airlie [EMAIL PROTECTED]
Date: Mon, 2 Apr 2007 09:44:41 +1000

 Okay I've got a bug reported before and now again about  4GB + radeon
 blows up the DRM... on Intel hw...
 
 What the drm currently does for the PCI GART table is it allocates a
 chunk of memory (8MB) with vmalloc_32(), then when it decides to use
 it it goes through every page of it calls pci_map_single() (with
 PCI_DMA_TODEVICE, which is probably wrong...) with every page from the
 vmalloc mapping and puts the bus addresses of the pages into the PCI
 GART table on the GPU.
 
 So when swiotlb happens, as you can guess it all falls apart as the
 drm never calls sync functions at any stage...

You would have hit this on any platform that does caching
in the PCI controller as well.

 The main problem is the ring buffer and scratch write back, these
 values are read/write from both the CPU and GPU quite a lot, so this
 leads me to think I should really just be using dma_alloc_coherent for
 the whole lot, however this is an 8MB mapping and possibly could be
 getting larger in the future and dynamic as we do dynamic PCIEGART
 support for the radeons...
 
 So I suppose I'm asking for ideas on the correct way to do this, and
 perhaps any quick way to patch up the problem I'm seeing now by making
 swiotlb not get involved 

Coherent memory was created for precisely the case where the cpu
and the device frequently access the memory.

8MB is indeed a lot for the kind of allocation that the coherent
DMA implementation uses.

Does it really have to be all in one big 8MB chunk?  I doubt it.
Perhaps you can therefore create multiple DMA pools instead?  See
include/linux/dmapool.h

-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT  business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: drm + 4GB RAM + swiotlb = drm craps out

2007-04-01 Thread David Miller
From: Dave Airlie [EMAIL PROTECTED]
Date: Mon, 2 Apr 2007 14:08:13 +1000

  
   So when swiotlb happens, as you can guess it all falls apart as the
   drm never calls sync functions at any stage...
 
  You would have hit this on any platform that does caching
  in the PCI controller as well.
 
 We must not have a great intersect of radeon and such systems..

It might explain why my machine hung when I tried to use
radeon with DRM on my sparc64 workstation :-)  I have
investigating that on my todo list.

 It currently is required to be in a big 8MB chunk as it gets chopped
 up by the X server not the kernel, so kernel needs to allocate pages
 to back it when X inits, yes this is ugly, no it can't be fixed
 without time-travelling and fixing deployed X servers...

 Really we probably only need the ring buffer to be in coherent memory,
 the rest of the stuff is used for DMA buffers which are mainly filled
 by the CPU and read by the GPU. However I cannot change this without
 breaking X, the solution is really to use TTM for this sort of
 stuff I'm a bit worried as the AGP driver now uses vmalloc_32
 which really is a meaningless interface on 64-bit systems..

I don't know what to recommend to you, getting 8MB of linear memory
really just isn't practical.

Perhaps we'll have to create something ugly like vmalloc_nobounce().

Remind me again why you're ending up with swiotlb'd pages?
vmalloc_32() uses GFP_KERNEL which should use entirely lowmem and thus
RAM below 4GB and not anything which should need bounce buffering.

You should only get swiotlb'd pages if __GFP_HIGHMEM were set in
the gfp flags.

Are you expecting to be able to virtually remap these pages in
PCI space as one huge 8MB chunk too and that's how swiotlb gets
involved?  That won't work, sorry...

-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT  business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: drm + 4GB RAM + swiotlb = drm craps out

2007-04-01 Thread David Miller
From: Dave Airlie [EMAIL PROTECTED]
Date: Mon, 2 Apr 2007 15:15:48 +1000

  Perhaps we'll have to create something ugly like vmalloc_nobounce().
 
  Remind me again why you're ending up with swiotlb'd pages?
  vmalloc_32() uses GFP_KERNEL which should use entirely lowmem and thus
  RAM below 4GB and not anything which should need bounce buffering.
 
 On a 64-bit machine GFP_KERNEL can give me any memory... it all works
 fine on 32-bit highmem kernel as I don't get highmem... I really need
 __GFP_DMA32 memory but we don't have a generic allocator that gives
 this out that I can see..

That clears things up thanks.

Perhaps the other uses of vmalloc_32() want GFP_32 semantics too,
although I didn't check.

-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT  business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel