Re: [git pull] drm request 3
From: Alan Cox a...@lxorguk.ukuu.org.uk Date: Fri, 5 Mar 2010 12:38:34 + The conclusion is crystal clear, breaking an ABI via a flag day cleanup/feature/etc is: Ingo go read the staging Kconfig. It's crystal clear, and lots of vendor junk that is in there being cleaned up it would be *insane* to keep their old APIs See there's a bigger offence than breaking an ABI - its called not RTFM. All of this RTFM and what directory the noveau driver is sitting in is entirely irrelevant Alan. If it effects such a large number of people, which this noveau thing does, it's entirely relevant to everyone. And the way it's breaking and making kernel development difficult for so many people matters to us. It's about the tester base, and this breakage shrinks the tester base considerably. Or do you want the kernel tested by less people? -- Download Intel#174; Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [git pull] drm request 3
From: Alan Cox a...@lxorguk.ukuu.org.uk Date: Fri, 5 Mar 2010 15:09:34 + I think you miss a bigger picture ? If Fedora hadn't merged it then it wouldn't have gotten to the state of usability it had. If Fedora hadn't merged it then several hundred thousand users wouldn't have had useful working machines. I think Fedora were right to merge it, and I think it was proper to merge it into the upstream kernel. But I also think the large size of that userbase should have been respected by doing the right thing here, and that means not making it so hard for Fedora users to use upstream kernels out of the box. Making the sandbox claim cuts both ways Alan. -- Download Intel#174; Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [git pull] drm request 3
From: Daniel Stone dan...@fooishbar.org Date: Fri, 5 Mar 2010 17:17:54 +0200 On Fri, Mar 05, 2010 at 06:37:18AM -0800, David Miller wrote: If it effects such a large number of people, which this noveau thing does, it's entirely relevant to everyone. And the way it's breaking and making kernel development difficult for so many people matters to us. Maybe the lesson to be learned from all this is, 'if the developers don't want something merged because they're not ready and forsee huge problems in the future, actually listen to them instead of blindly ramming it in regardless'? But maybe that's just me. That's doesn't work, and it never will. First of all, if we didn't merge the driver Fedora users wouldn't be able to test the upstream kernel at all. And if you think things through, there is one and onle one set of actions that would have made things work properly. And that's merge the driver upstream and not break user visible APIs. In fact, I argue that the moment nouveau went into Fedora and was turned on by default, the interfaces needed to be frozen. Consider if it didn't go upstream and I want to do upstream kernel development, ok so I patch the noveau-of-the-moment into my upstream tree. Six months and 10 DRM library updates later I go back and try to boot that kernel. And it's not going to work. So if the user visible APIs are changed in any set of situations (upstream merged, not upstream merged, etc.) things can end up breaking. Ergo, you simply can't sanely do it at all. You have to have a compatability story when you change these things. Personally I wouldn't have ever committed to that user visible APIs can break cause it's in -stable. Because that's complete garbage. -- Download Intel#174; Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [git pull] drm request 3
From: Daniel Stone dan...@fooishbar.org Date: Fri, 5 Mar 2010 17:40:09 +0200 On Fri, Mar 05, 2010 at 07:26:12AM -0800, David Miller wrote: In fact, I argue that the moment nouveau went into Fedora and was turned on by default, the interfaces needed to be frozen. That's a matter for the Fedora kernel team; for better or worse, they made the choices they did, which included going through all the relevant pain to support this. They didn't consider it suitable for upstream because they didn't think everyone else should be forced to endure that pain. By not merging it upstream the pain is larger not smaller. It's enabled by default, so you therefore can't test upstream kernels by default. And as I showed already, even if you jump through the hoops to make it work (building noveau from out of tree in the upstream kernel) you'll end up getting screwed when the API changes anyways. Using VESA or whatever else you've suggested is just not a reasonable alternative. You can't unleash something like this on a userbase of this magnitude and then throw your hands up in the air and say I'm not willing to support this in a reasonable way. We're better than that. -- Download Intel#174; Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: Making Xorg easier to test
From: Daniel Stone dan...@fooishbar.org Date: Fri, 5 Mar 2010 17:41:43 +0200 I understand that you guys are upset about this, so maybe you'd like to donate, say, 10% of your developer base to help out? That'd be pretty ace. You have to support less than %10 of the amount of hardware we have to support. -- Download Intel#174; Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [git pull] drm request 3
From: Alan Cox a...@lxorguk.ukuu.org.uk Date: Fri, 5 Mar 2010 16:02:17 + You can't unleash something like this on a userbase of this magnitude and then throw your hands up in the air and say I'm not willing to support this in a reasonable way. Not to belabour the obvious - they didn't. Linus ordered them to merge it. And I'm arguing not merging it upstream would be like saying the same thing. We're better than that. If you consider the problem is with the Fedora kernel team decision to merge it into Fedora in the first place For the second time Alan, I don't. I think Fedora should have merged it. I think it should be upstream. And I think the API bustage needs to be avoided. -- Download Intel#174; Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [git pull] drm request 3
From: Daniel Stone dan...@fooishbar.org Date: Fri, 5 Mar 2010 18:04:34 +0200 So you're saying that there's no way to develop any reasonable body of code for the Linux kernel without committing to keeping your ABI absolutely rock-solid stable for eternity, no exceptions, ever? Cool, that worked really well for Xlib. read() still works the same way it did 30 years ago last time I checked. -- Download Intel#174; Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: Making Xorg easier to test
From: Xavier Bestel xavier.bes...@free.fr Date: Fri, 05 Mar 2010 18:50:24 +0100 On Fri, 2010-03-05 at 07:49 -0800, David Miller wrote: From: Daniel Stone dan...@fooishbar.org Date: Fri, 5 Mar 2010 17:41:43 +0200 I understand that you guys are upset about this, so maybe you'd like to donate, say, 10% of your developer base to help out? That'd be pretty ace. You have to support less than %10 of the amount of hardware we have to support. You can't compare a network card and a GPU. The latter is way more complex to code for. I wasn't talking specifically about network cards. But if you want specific examples... How about the x86 or parisc cpu, and all their derivatives, are those complex enough for you? :-) I've worked on OpenGL capable grapics card drivers of various vintages, and I honestly don't think there is anything in there more complex than what we have to deal with in the kernel. -- Download Intel#174; Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: Making Xorg easier to test
From: Jesse Barnes jbar...@virtuousgeek.org Date: Fri, 5 Mar 2010 10:02:44 -0800 So from that perspective, the graphics stack is the most complex one in Linux by a long shot. It's even worse than if we had STREAMS networking with a ton of different modules up in userspace messing with protocol. :) Maybe :-) -- Download Intel#174; Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [git pull] drm
From: Alan Cox a...@lxorguk.ukuu.org.uk Date: Fri, 11 Dec 2009 09:18:43 + However the fundamental point stands. The only people who can sign it off are the people who wrote it. Those are the rules. Red Hat didn't write the code, Red Hat cannot sign it off however much you rant at them. You also previously said you don't want to merge stuff when the authors don't want it merged. I agree with a lot of what you say. However, one point remains is that we were told, by Dave Airlie, that they didn't want this code merged because the one person being paid to work on it would be overwhelmed if the code went upstream. I distinctly remember this being mentioned at the kernel summit. And you know what? That kind of excuse pisses me off too :-) -- Return on Information: Google Enterprise Search pays you back Get the facts. http://p.sf.net/sfu/google-dev2dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Heiko Carstens heiko.carst...@de.ibm.com Date: Thu, 29 Oct 2009 09:34:16 +0100 Subject: [PATCH] fs: add missing compat_ptr handling for FS_IOC_RESVSP ioctl From: Heiko Carstens heiko.carst...@de.ibm.com For FS_IOC_RESVSP and FS_IOC_RESVSP64 compat_sys_ioctl() uses its arg argument as a pointer to userspace. However it is missing a a call to compat_ptr() which will do a proper pointer conversion. This was introduced with 3e63cbb1 fs: Add new pre-allocation ioctls to vfs for compatibility with legacy xfs ioctls. Cc: Ankit Jain m...@ankitjain.org Cc: Christoph Hellwig h...@lst.de Cc: Al Viro v...@zeniv.linux.org.uk Cc: Arnd Bergmann arndbergm...@googlemail.com Reported-by: David Miller da...@davemloft.net Signed-off-by: Heiko Carstens heiko.carst...@de.ibm.com Acked-by: David S. Miller da...@davemloft.net -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Dave Airlie airl...@linux.ie Date: Wed, 28 Oct 2009 05:42:27 + (GMT) I'll add this to my TODO for before the next merge window as its definitely more than I can manage now. I'll do it. -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: David Miller da...@davemloft.net Date: Tue, 27 Oct 2009 23:04:50 -0700 (PDT) From: Dave Airlie airl...@linux.ie Date: Wed, 28 Oct 2009 05:42:27 + (GMT) I'll add this to my TODO for before the next merge window as its definitely more than I can manage now. I'll do it. Ok, everything was straightforward except for radeon_cs, which has three levels of indirection for the tables of data and relocation chunks userland can pass into the DRM :-/ There is no way to isolate the compat handling without parsing and bringing the pointer array and the chunk headers into the kernel twice. So I guess I'm willing to capitulate with is_compat_task() checks in radeon_cs_init(), but if you want to improve this interface I see two ways: 1) Eliminate one level of indirection so there is just a straight scatter-gather list at one level. 2) Use a base pointer and a set of offsets to communicate at least one level of indirection. Both schemes should satisfy the buffering flexibility these interfaces seem to be geared towards giving to userland. Anyways the following patch is tested and seems to work well. Signed-off-by: David S. Miller da...@davemloft.net diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 5ab2cf9..ba44eba 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -24,6 +24,8 @@ * Authors: *Jerome Glisse gli...@freedesktop.org */ +#include linux/compat.h + #include drmP.h #include radeon_drm.h #include radeon_reg.h @@ -107,7 +109,12 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data) if (p-chunks_array == NULL) { return -ENOMEM; } - chunk_array_ptr = (uint64_t *)(unsigned long)(cs-chunks); +#ifdef CONFIG_COMPAT + if (is_compat_task()) + chunk_array_ptr = compat_ptr((compat_uptr_t) cs-chunks); + else +#endif + chunk_array_ptr = (uint64_t *)(unsigned long)(cs-chunks); if (DRM_COPY_FROM_USER(p-chunks_array, chunk_array_ptr, sizeof(uint64_t)*cs-num_chunks)) { return -EFAULT; @@ -120,9 +127,15 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data) for (i = 0; i p-nchunks; i++) { struct drm_radeon_cs_chunk __user **chunk_ptr = NULL; struct drm_radeon_cs_chunk user_chunk; - uint32_t __user *cdata; - chunk_ptr = (void __user*)(unsigned long)p-chunks_array[i]; +#ifdef CONFIG_COMPAT + if (is_compat_task()) + chunk_ptr = compat_ptr((compat_uptr_t) + p-chunks_array[i]); + else +#endif + chunk_ptr = (void __user*) (unsigned long) + p-chunks_array[i]; if (DRM_COPY_FROM_USER(user_chunk, chunk_ptr, sizeof(struct drm_radeon_cs_chunk))) { return -EFAULT; @@ -142,9 +155,16 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data) } p-chunks[i].length_dw = user_chunk.length_dw; - p-chunks[i].user_ptr = (void __user *)(unsigned long)user_chunk.chunk_data; +#ifdef CONFIG_COMPAT + if (is_compat_task()) + p-chunks[i].user_ptr = + compat_ptr((compat_uptr_t) + user_chunk.chunk_data); + else +#endif + p-chunks[i].user_ptr = (void __user *) (unsigned long) + user_chunk.chunk_data; - cdata = (uint32_t *)(unsigned long)user_chunk.chunk_data; if (p-chunks[i].chunk_id != RADEON_CHUNK_ID_IB) { size = p-chunks[i].length_dw * sizeof(uint32_t); p-chunks[i].kdata = kmalloc(size, GFP_KERNEL); diff --git a/drivers/gpu/drm/radeon/radeon_ioc32.c b/drivers/gpu/drm/radeon/radeon_ioc32.c index a1bf11d..3c4dfa2 100644 --- a/drivers/gpu/drm/radeon/radeon_ioc32.c +++ b/drivers/gpu/drm/radeon/radeon_ioc32.c @@ -156,7 +156,7 @@ static int compat_radeon_cp_stipple(struct file *file, unsigned int cmd, typedef struct drm_radeon_tex_image32 { unsigned int x, y; /* Blit coordinates */ unsigned int width, height; - u32 data; + compat_uptr_t data; } drm_radeon_tex_image32_t; typedef struct drm_radeon_texture32 { @@ -165,7 +165,7 @@ typedef struct drm_radeon_texture32 { int format; int width; /* Texture image coordinates */ int height; - u32 image; + compat_uptr_t image; } drm_radeon_texture32_t; static int compat_radeon_cp_texture(struct file *file, unsigned int cmd, @@ -180,8 +180,7 @@ static int compat_radeon_cp_texture(struct file *file, unsigned int cmd, return -EFAULT
Re: is avoiding compat ioctls possible?
From: Andi Kleen a...@firstfloor.org Date: Wed, 28 Oct 2009 08:59:08 +0100 } -chunk_array_ptr = (uint64_t *)(unsigned long)(cs-chunks); +#ifdef CONFIG_COMPAT +if (is_compat_task()) Are the COMPAT ifdefs really needed? The compiler should optimize that away anyways on non compat aware architectures, shouldn't it? There are no non-compat is_compat_task() definitions, nor are there non-compat build definitions of compat_uptr_t and the assosciated interfaces. -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Andi Kleen a...@firstfloor.org Date: Wed, 28 Oct 2009 09:19:09 +0100 On Wed, Oct 28, 2009 at 01:11:41AM -0700, David Miller wrote: From: Andi Kleen a...@firstfloor.org Date: Wed, 28 Oct 2009 08:59:08 +0100 } - chunk_array_ptr = (uint64_t *)(unsigned long)(cs-chunks); +#ifdef CONFIG_COMPAT + if (is_compat_task()) Are the COMPAT ifdefs really needed? The compiler should optimize that away anyways on non compat aware architectures, shouldn't it? There are no non-compat is_compat_task() definitions, nor are there non-compat build definitions of compat_uptr_t and the assosciated interfaces. That seems wrong then, better fix that too? It would be certainly better than adding a lot of ifdefs. That's usually done by seperating the compat code into a seperate file and obj-$(CONFIG_COMPAT) += foo.o in the Makefile. That's not really possible here. Sure, longer term we can provide those kinds of definitions to avoid the ifdefs, but that's not what my change is about. :-) -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Arnd Bergmann arndbergm...@googlemail.com Date: Wed, 28 Oct 2009 13:13:32 +0100 IMHO a better way to handle the radeon specific ioctls would be to avoid the compat_alloc_user_space and just define the common function taking a kernel pointer, with two implementations of the copy operation: Agreed, that would be a great follow-on patch to mine. -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Arnd Bergmann arndbergm...@googlemail.com Date: Wed, 28 Oct 2009 13:13:32 +0100 The ioctl argument actually needs a compat_ptr() conversion as well. For the s390 case, we can't do that in common code, because some ioctl methods put a 32 bit integer into the argument. Not sure if we want to fix that everywhere, the problem is very common and the impact is minimal. What does s390 do with the 'arg' argument to sys_ioctl()? That assumption that you can cast this to a pointer is everywhere. If someone wants to fix this up, feel free to do an audit and go over that seperately from my work :-) -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Arnd Bergmann arndbergm...@googlemail.com Date: Wed, 28 Oct 2009 16:40:18 +0100 I'm pretty sure it was ok when we started adding the compat_ioctl handlers years ago. I think most people just ignored these for the majority of drivers that can't possibly run on s390. Even on s390, gcc will always do the right thing if you call call ioctl with a pointer to a normal object in the .data section, heap or stack, but hand-written assembly or other compilers may not. Arnd, even compat_sys_ioctl() itself has constructs like: case FS_IOC_RESVSP: case FS_IOC_RESVSP64: error = ioctl_preallocate(filp, (void __user *)arg); goto out_fput; That's why I asked about the 'arg' argument to sys_ioctl on s390 :-) -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Dave Airlie airl...@gmail.com Date: Wed, 28 Oct 2009 11:22:18 +1000 Is there really no way to avoid compat ioctls? was I delusional in thinking there was? If you use pointers in your interfaces in any way, no. And for this drm_radeon_info thing the pointer is pointless, you're just returning 32-bit values to the user, just use a u32 inside of the drm_radeon_info structure for the kernel to place the result in. -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Andi Kleen a...@firstfloor.org Date: Wed, 28 Oct 2009 03:53:17 +0100 Dave Airlie airl...@gmail.com writes: Now I thought cool I don't need to worry about compat ioctl hackery I can run 32 on 64 bit apps fine and it'll all just work. Now Dave Miller points out that I'm obivously deluded and we really need to add compat ioctls so that the kernel can truncate correctly 32-bit address in case userspace shoves garbage into the top 32bits of the u64. When the user space sees a u64 field it should never shove garbage here. The problem is that it can if it wants to. On sparc64, in order to make debugging easier, we trap any time the kernel does a userspace access to a compat task and any of the upper 32-bits are non-zero. That's more than a reasonable assumption to make because user pointers are either: 1) Zero chopped when passed in via register arguments to a syscall. 2) Go through the compat_uptr() interface which chops the upper bits. So whether userland should or should not do something, we still have to guard against it. You just have to cast on 32bit for this, which is a bit ugly. Right, via the compat_*() interfaces. However some architectures need special operations on compat pointers (s390 iirc), but if you don't support those it might be reasonable to not support that. s390 has to sign extend all 32-bit compat process pointers when processing them in the 64-bit s390 kernel. I think one other architecture has this kind of situation too. -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Dave Airlie airl...@gmail.com Date: Wed, 28 Oct 2009 13:05:08 +1000 DrNick on irc suggested just doing: if (is_compat_task()) ptr = 0x; Is there a one liner I can just do in the actual ioctls instead of adding 20 compat ones? Just do the right thing and pass all userland compat pointers through the correct compat_*() macros. -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Dave Airlie airl...@gmail.com Date: Wed, 28 Oct 2009 13:28:10 +1000 On Wed, Oct 28, 2009 at 1:19 PM, Andi Kleen a...@firstfloor.org wrote: On Wed, Oct 28, 2009 at 01:05:08PM +1000, Dave Airlie wrote: We've designed that into a/c also, we pad all 64-bit values to 64-bit alignment on all the ioctls we've added to the drm in the past couple of years. Just because of this particular insanity. That's actually not needed, just use compat_*64. Assume no mistakes are made, new ioctls designed from scratch That seems like a bad assumption. It sounds like you already made some. You mean its impossible to design a 32/64-bit safe ioctl no matter what? If you use pointers at all, yes. We've said this several times. and we should live with having compat ioctls? This isn't something that was very clearly stated or documented, the advice we had previously was that compat ioctls were only required for old ioctls or ioctls with design problems. compat_*64 didn't exist when this code we designed, and we worked around that, but it was in no way mistaken, manually aligning 64-bit values is a perfectly good solution, not sure why you imply it isn't. Manually aligning the way you have solves only one side of the compat problem on x86 with 64-bit types. It may handle the layout properly bit it doesn't change the alignment requirements of the containing structure and that's a similarly important the issue. If you want to solve both aspects, use the aligned_u64 type. But once you introduce pointers, you must have compat layer code, and this is a hard requirement. And there is nothing baroque or wrong about it. -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Andi Kleen a...@firstfloor.org Date: Wed, 28 Oct 2009 04:34:55 +0100 On Wed, Oct 28, 2009 at 01:28:10PM +1000, Dave Airlie wrote: Well this was what I was trying to gather, so maybe I just need to write something up to state that compat_ioctl is always required for new ioctls that pass pointers or 64-bit values hiding pointers, so more people don't make this mistake going forward. I can say when we inquired about this 2 or so years ago when designing kms I didn't get this answer, which is a pity. Right now you could probably ignore it (if you document it), since there are no non s390 architectures with this problem, just prepare mentally that you might need to revisit this at some point. You can't ignore it on sparc64, it already OOPS's, and I refuse to live with that if (is_compat_task()) masking hack, no way. We designed portable interfaces for doing this stuff, please use it. -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Dave Airlie airl...@linux.ie Date: Wed, 28 Oct 2009 03:43:07 + (GMT) we already opencoded this (probably before it was macroisied or we just pasted it), so the radeon one is buggy, I should just go and compat_* all of these then and we should be all happy? It should be, it's only working because: 1) A malicious userland hasn't put garbage in the upper bits for you yet. 2) Nobody has tested s390 yet :-) Let's just not design non-portability into the code when we have no strong reason to, ok? :-) -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Dave Airlie airl...@linux.ie Date: Wed, 28 Oct 2009 03:54:34 + (GMT) we already opencoded this (probably before it was macroisied or we just pasted it), so the radeon one is buggy, I should just go and compat_* all of these then and we should be all happy? It should be, it's only working because: 1) A malicious userland hasn't put garbage in the upper bits for you yet. 2) Nobody has tested s390 yet :-) So will an inline like this work? static inline void *__user convert_user_ptr(uint64_t ioctl_ptr) { Please don't do this. This is exactly what I feared people would do when is_compat_task() was added. is_compat_task() is for situations where there is otherwise no other way to handle the compat situation properly. It's not that much work for you to hook up the compat ioctls properly, and if you are clever you can do it in such a way that you'll get warnings if someone accidently adds a new ioctl but forgets the compat bits :-) -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: is avoiding compat ioctls possible?
From: Andi Kleen a...@firstfloor.org Date: Wed, 28 Oct 2009 05:36:11 +0100 On Tue, Oct 27, 2009 at 08:37:09PM -0700, David Miller wrote: On sparc64, in order to make debugging easier, we trap any time the kernel does a userspace access to a compat task and any of the upper 32-bits are non-zero. Interesting. That definitely means Dave needs a special path. I wouldn't call it special, the rule is that any userland pointer has to either go through a syscall argument register or one of the compat_*() accessor routines :-) -- Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86.
From: Arnd Bergmann a...@arndb.de Date: Thu, 19 Feb 2009 15:19:01 +0100 On Wednesday 18 February 2009, David Miller wrote: drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86. Only X86 32-bit uses a different alignment for unsigned long long than it's 64-bit counterpart. Therefore this compat translation is only correct, and only needed, when either CONFIG_X86 or CONFIG_IA64. Signed-off-by: David S. Miller da...@davemloft.net The patch is correct AFAICT, but I'd like to point out that the problem could have been avoided (besides using a non-padded layout) by using a compat_u64 member in the struct definition instead of the packed attribute: Indeed, David A. showed me compat_u64 et al. and I'm fine with it being fixed that way too. Feel free to submit a patch :) -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Preserve SHMLBA bits in hash key for _DRM_SHM mappings.
From: Andrew Morton a...@linux-foundation.org Date: Thu, 19 Feb 2009 15:27:26 -0800 eg: arch/xtensa/include/asm/shmparam.h #define SHMLBA ((PAGE_SIZE DCACHE_WAY_SIZE)? PAGE_SIZE : DCACHE_WAY_SIZE) But including linux/shm.h here seems a bit silly. We'll see.. If DRM even builds on XTENSA, let alone is usable there, I'll buy you a lollipop. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
drm: radeon: Fix unaligned access in r300_scratch().
drm: radeon: Fix unaligned access in r300_scratch(). In compat mode, the cmdbuf-buf 64-bit address cookie can potentially be only 32-bit aligned. Dereferencing this as 64-bit causes expensive unaligned traps on platforms like sparc64. Use get_unaligned() to fix. Signed-off-by: David S. Miller da...@davemloft.net --- drivers/gpu/drm/radeon/r300_cmdbuf.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/drivers/gpu/drm/radeon/r300_cmdbuf.c b/drivers/gpu/drm/radeon/r300_cmdbuf.c index cace396..3efa633 100644 --- a/drivers/gpu/drm/radeon/r300_cmdbuf.c +++ b/drivers/gpu/drm/radeon/r300_cmdbuf.c @@ -37,6 +37,8 @@ #include radeon_drv.h #include r300_reg.h +#include asm/unaligned.h + #define R300_SIMULTANEOUS_CLIPRECTS4 /* Values for R300_RE_CLIPRECT_CNTL depending on the number of cliprects @@ -917,6 +919,7 @@ static int r300_scratch(drm_radeon_private_t *dev_priv, { u32 *ref_age_base; u32 i, buf_idx, h_pending; + u64 ptr_addr; RING_LOCALS; if (cmdbuf-bufsz @@ -930,7 +933,8 @@ static int r300_scratch(drm_radeon_private_t *dev_priv, dev_priv-scratch_ages[header.scratch.reg]++; - ref_age_base = (u32 *)(unsigned long)*((uint64_t *)cmdbuf-buf); + ptr_addr = get_unaligned((u64 *)cmdbuf-buf); + ref_age_base = (u32 *)(unsigned long)ptr_addr; cmdbuf-buf += sizeof(u64); cmdbuf-bufsz -= sizeof(u64); -- 1.6.1.2.350.g88cc -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH] drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86.
Ben, I'm pretty sure you're hitting this too on powerpc. Every time a 32-bit process tries to upload cliprects it's going to fail with -EFAULT or similar. Nothing in userspace checks the return value for errors, etc. :-/ The only reason I caught this is because I have a debugging check on sparc64 that makes sure that faults on kernel accesses to 32-bit userspace never have any of the high 32-bits set. drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86. Only X86 32-bit uses a different alignment for unsigned long long than it's 64-bit counterpart. Therefore this compat translation is only correct, and only needed, when either CONFIG_X86 or CONFIG_IA64. Signed-off-by: David S. Miller da...@davemloft.net --- drivers/gpu/drm/drm_ioc32.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/drm_ioc32.c b/drivers/gpu/drm/drm_ioc32.c index 920b72f..282d9fd 100644 --- a/drivers/gpu/drm/drm_ioc32.c +++ b/drivers/gpu/drm/drm_ioc32.c @@ -954,6 +954,7 @@ static int compat_drm_sg_free(struct file *file, unsigned int cmd, DRM_IOCTL_SG_FREE, (unsigned long)request); } +#if defined(CONFIG_X86) || defined(CONFIG_IA64) typedef struct drm_update_draw32 { drm_drawable_t handle; unsigned int type; @@ -984,6 +985,7 @@ static int compat_drm_update_draw(struct file *file, unsigned int cmd, DRM_IOCTL_UPDATE_DRAW, (unsigned long)request); return err; } +#endif struct drm_wait_vblank_request32 { enum drm_vblank_seq_type type; @@ -1066,7 +1068,9 @@ drm_ioctl_compat_t *drm_compat_ioctls[] = { #endif [DRM_IOCTL_NR(DRM_IOCTL_SG_ALLOC32)] = compat_drm_sg_alloc, [DRM_IOCTL_NR(DRM_IOCTL_SG_FREE32)] = compat_drm_sg_free, +#if defined(CONFIG_X86) || defined(CONFIG_IA64) [DRM_IOCTL_NR(DRM_IOCTL_UPDATE_DRAW32)] = compat_drm_update_draw, +#endif [DRM_IOCTL_NR(DRM_IOCTL_WAIT_VBLANK32)] = compat_drm_wait_vblank, }; -- 1.6.1.2.350.g88cc -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86.
From: Benjamin Herrenschmidt b...@kernel.crashing.org Date: Thu, 19 Feb 2009 08:59:50 +1100 Could that be related to the kernel spewing a bunch of [drm:drm_update_drawable_info] *ERROR* Failed to copy cliprects from userspace Yep, that is exactly caused by this bug. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH] drm: Preserve SHMLBA bits in hash key for _DRM_SHM mappings.
Ok, this is the last DRM bug I am aware of with Radeon on sparc64. What kills me is that I fixed this bug 6 or 7 years ago :) drm: Preserve SHMLBA bits in hash key for _DRM_SHM mappings. Platforms such as sparc64 have D-cache aliasing issues. We cannot allow virtual mappings in different contexts to be such that two cache lines can be loaded for the same backing data. Updates to one cache line won't be seen by accesses to the other cache line. Code in sparc64 and other architectures solve this problem by making sure that all userland mappings of MAP_SHARED objects have the same virtual address base. They implement this by keying off of the page offset, and using that to choose a suitably consistent virtual address for mmap() requests. Making things even worse, getting this wrong on sparc64 can result in hangs during DRM lock acquisition. This is because, at least on UltraSPARC-III, normal loads consult the D-cache but atomics such as 'cas' (which is what cmpxchg() is implement using) only consult the L2 cache. So if a D-cache alias is inserted, the load can see different data than the atomic, and we'll loop forever because the atomic compare-and-exchange will never complete successfully. So to make this all work properly, we need to make sure that the hash address computed by drm_map_handle() preserves the SHMLBA relevant bits, and that's what this patch does for _DRM_SHM mappings. As a historical note, many years ago this bug didn't exist because we used to just use the low 32-bits of the address as the hash and just hope for the best. This preserved the SHMLBA bits properly. But when the hashtab code was added to DRM, this was no longer the case. Signed-off-by: David S. Miller da...@davemloft.net --- drivers/gpu/drm/drm_bufs.c | 35 +++ 1 files changed, 31 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c index 069544c..445c762 100644 --- a/drivers/gpu/drm/drm_bufs.c +++ b/drivers/gpu/drm/drm_bufs.c @@ -34,6 +34,8 @@ */ #include linux/vmalloc.h +#include linux/log2.h +#include asm/shmparam.h #include drmP.h resource_size_t drm_get_resource_start(struct drm_device *dev, unsigned int resource) @@ -83,9 +85,11 @@ static struct drm_map_list *drm_find_matching_map(struct drm_device *dev, } static int drm_map_handle(struct drm_device *dev, struct drm_hash_item *hash, - unsigned long user_token, int hashed_handle) + unsigned long user_token, int hashed_handle, int shm) { - int use_hashed_handle; + int use_hashed_handle, shift; + unsigned long add; + #if (BITS_PER_LONG == 64) use_hashed_handle = ((user_token 0xUL) || hashed_handle); #elif (BITS_PER_LONG == 32) @@ -101,9 +105,31 @@ static int drm_map_handle(struct drm_device *dev, struct drm_hash_item *hash, if (ret != -EINVAL) return ret; } + + shift = 0; + add = DRM_MAP_HASH_OFFSET PAGE_SHIFT; + if (shm (SHMLBA PAGE_SIZE)) { + int bits = ilog2(SHMLBA PAGE_SHIFT) + 1; + + /* For shared memory, we have to preserve the SHMLBA +* bits of the eventual vma-vm_pgoff value during +* mmap(). Otherwise we run into cache aliasing problems +* on some platforms. On these platforms, the pgoff of +* a mmap() request is used to pick a suitable virtual +* address for the mmap() region such that it will not +* cause cache aliasing problems. +* +* Therefore, make sure the SHMLBA relevant bits of the +* hash value we use are equal to those in the original +* kernel virtual address. +*/ + shift = bits; + add |= ((user_token PAGE_SHIFT) ((1UL bits) - 1UL)); + } + return drm_ht_just_insert_please(dev-map_hash, hash, user_token, 32 - PAGE_SHIFT - 3, -0, DRM_MAP_HASH_OFFSET PAGE_SHIFT); +shift, add); } /** @@ -323,7 +349,8 @@ static int drm_addmap_core(struct drm_device * dev, resource_size_t offset, /* We do it here so that dev-struct_mutex protects the increment */ user_token = (map-type == _DRM_SHM) ? (unsigned long)map-handle : map-offset; - ret = drm_map_handle(dev, list-hash, user_token, 0); + ret = drm_map_handle(dev, list-hash, user_token, 0, +(map-type == _DRM_SHM)); if (ret) { if (map-type == _DRM_REGISTERS) iounmap(map-handle); -- 1.6.1.2.350.g88cc -- Open Source Business Conference (OSBC), March 24-25, 2009,
Re: [dri-devel] BROKEN drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init().
From: Sedat Dilek sedat.di...@googlemail.com Date: Sun, 15 Feb 2009 10:28:29 +0100 unfortunately, your latest DRM patch [1] is broken while building against Linux-2.6.25-rc5. This is my patch-series: Did you read my patch postings? I said my work was relative to dri-next, as sparc64 also needs the changes David has commited there from Ben H. It therefore uses types only found in that branch. In David's dri-next branch, the proper type is struct drm_local_map, so if you're not patching against his dri-next branch you won't have Ben H's drm_addmap() changes which create and use that new type. You will notice that when David added my changes to his dri-fixes branch, he corrected up that single line which you could have very easily done as well. Just change struct drm_local_map to just plain struct drm_map. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
From: Dave Airlie airl...@gmail.com Date: Sat, 14 Feb 2009 17:42:02 +1000 On Sat, Feb 14, 2009 at 4:09 PM, David Miller da...@davemloft.net wrote: 1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead of the constant 4096 to determine how many GART entries there are. The PCI GART entries map 4096 bytes, always. So using getpagesize() is wrong. (see RADEONDRIGetPciAperTableSize) I have this fixed in my local tree. Oops. 2) It doesn't check the surface byte swapping settings, so it could be saving in one byte order and restoing in another. I guess we could force RADEON_SURFACE_CNTL to zero around the two memcpy()'s done in radeon_driver.c Might be a good plan. I have patches for both of these things written, will submit to the xorg-driver-ati list. I also have a cunning plan to work around the surface swapping GART issue in the DRM, will try that out right now. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
From: Benjamin Herrenschmidt b...@kernel.crashing.org Date: Sat, 14 Feb 2009 20:07:54 +1100 I did some research, and it does appear that the GART does read the PTEs from the VRAM using the Host Data Path. This means the surface control byte swapping settings are applied. So for depths of 16 and 24, the GART is reading garbage PTEs. And that's why the CP hangs. That makes me wonder how the heck did it work for me ! Or maybe... I've been using an R5xx which happens to have a bit that I haven't seen on R3xx that allows ... to set whether the GART reads come from HDP or directly from MC. That might be what saved my ass here. I wonder. But I really doubt it. The bit is off by default and the radeon DRM code explicitly sets it to off. We can do that by registering a surface from the kernel to cover the GART I suppose, and clean things a bit so that when using the DRI, X doesn't touch the surface registers -at all- and leaves it to the kernel. That actually sounds like a good idea. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH]: drm: radeon: Use surface for PCI GART table.
This allocates a physical surface for the PCI GART table, this way no matter what other surface configurations exist the GART table will always be seen by the hardware properly. We encode the file pointer of the virtual surface allocate using a special cookie value, called PCIGART_FILE_PRIV. On the last close, we release that surface. Just to be doubly safe, we run the pcigart table setup with the main surface control register clear. Based upon ideas from David Airlie and Ben Benjamin Herrenschmidt. Signed-off-by: David S. Miller da...@davemloft.net diff --git a/drivers/gpu/drm/radeon/radeon_cp.c b/drivers/gpu/drm/radeon/radeon_cp.c index e235778..119e642 100644 --- a/drivers/gpu/drm/radeon/radeon_cp.c +++ b/drivers/gpu/drm/radeon/radeon_cp.c @@ -909,6 +909,46 @@ static void radeon_set_pcigart(drm_radeon_private_t * dev_priv, int on) } } +static int radeon_setup_pcigart_surface(drm_radeon_private_t *dev_priv) +{ + struct drm_ati_pcigart_info *gart_info = dev_priv-gart_info; + struct radeon_virt_surface *vp; + int i; + + for (i = 0; i RADEON_MAX_SURFACES * 2; i++) { + if (!dev_priv-virt_surfaces[i].file_priv || + dev_priv-virt_surfaces[i].file_priv == PCIGART_FILE_PRIV) + break; + } + if (i = 2 * RADEON_MAX_SURFACES) + return -ENOMEM; + vp = dev_priv-virt_surfaces[i]; + + for (i = 0; i RADEON_MAX_SURFACES; i++) { + struct radeon_surface *sp = dev_priv-surfaces[i]; + if (sp-refcount) + continue; + + vp-surface_index = i; + vp-lower = gart_info-bus_addr; + vp-upper = vp-lower + gart_info-table_size; + vp-flags = 0; + vp-file_priv = PCIGART_FILE_PRIV; + + sp-refcount = 1; + sp-lower = vp-lower; + sp-upper = vp-upper; + sp-flags = 0; + + RADEON_WRITE(RADEON_SURFACE0_INFO + 16 * i, sp-flags); + RADEON_WRITE(RADEON_SURFACE0_LOWER_BOUND + 16 * i, sp-lower); + RADEON_WRITE(RADEON_SURFACE0_UPPER_BOUND + 16 * i, sp-upper); + return 0; + } + + return -ENOMEM; +} + static int radeon_do_init_cp(struct drm_device *dev, drm_radeon_init_t *init, struct drm_file *file_priv) { @@ -1202,6 +1242,9 @@ static int radeon_do_init_cp(struct drm_device *dev, drm_radeon_init_t *init, } else #endif { + u32 sctrl; + int ret; + dev_priv-gart_info.table_mask = DMA_BIT_MASK(32); /* if we have an offset set from userspace */ if (dev_priv-pcigart_offset_set) { @@ -1243,12 +1286,25 @@ static int radeon_do_init_cp(struct drm_device *dev, drm_radeon_init_t *init, } } - if (!drm_ati_pcigart_init(dev, dev_priv-gart_info)) { + sctrl = RADEON_READ(RADEON_SURFACE_CNTL); + RADEON_WRITE(RADEON_SURFACE_CNTL, 0); + ret = drm_ati_pcigart_init(dev, dev_priv-gart_info); + RADEON_WRITE(RADEON_SURFACE_CNTL, sctrl); + + if (!ret) { DRM_ERROR(failed to init PCI GART!\n); radeon_do_cleanup_cp(dev); return -ENOMEM; } + ret = radeon_setup_pcigart_surface(dev_priv); + if (ret) { + DRM_ERROR(failed to setup GART surface!\n); + drm_ati_pcigart_cleanup(dev, dev_priv-gart_info); + radeon_do_cleanup_cp(dev); + return ret; + } + /* Turn on PCI GART */ radeon_set_pcigart(dev_priv, 1); } diff --git a/drivers/gpu/drm/radeon/radeon_drv.h b/drivers/gpu/drm/radeon/radeon_drv.h index 9b60a26..ecfd414 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.h +++ b/drivers/gpu/drm/radeon/radeon_drv.h @@ -217,6 +217,7 @@ struct radeon_virt_surface { u32 upper; u32 flags; struct drm_file *file_priv; +#define PCIGART_FILE_PRIV ((void *) -1L) }; #define RADEON_FLUSH_EMITED(1 0) diff --git a/drivers/gpu/drm/radeon/radeon_state.c b/drivers/gpu/drm/radeon/radeon_state.c index 03fea43..043293a 100644 --- a/drivers/gpu/drm/radeon/radeon_state.c +++ b/drivers/gpu/drm/radeon/radeon_state.c @@ -3155,6 +3155,7 @@ void radeon_driver_preclose(struct drm_device *dev, struct drm_file *file_priv) void radeon_driver_lastclose(struct drm_device *dev) { + radeon_surfaces_release(PCIGART_FILE_PRIV, dev-dev_private); radeon_do_release(dev); } -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the
Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
From: David Miller da...@davemloft.net Date: Sat, 14 Feb 2009 01:11:45 -0800 (PST) From: Benjamin Herrenschmidt b...@kernel.crashing.org Date: Sat, 14 Feb 2009 20:07:54 +1100 We can do that by registering a surface from the kernel to cover the GART I suppose, and clean things a bit so that when using the DRI, X doesn't touch the surface registers -at all- and leaves it to the kernel. That actually sounds like a good idea. Ok I have it working, patch coming right up. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH]: drm: radeon: Use surface for PCI GART table.
From: Michel Dänzer mic...@daenzer.net Date: Sat, 14 Feb 2009 10:59:59 +0100 On Sat, 2009-02-14 at 01:51 -0800, David Miller wrote: This allocates a physical surface for the PCI GART table, this way no matter what other surface configurations exist the GART table will always be seen by the hardware properly. BTW, I don't think the swapping settings affect GPU access to the table, only CPU access, but this is a good solution anyway. They absolutely and positively do effect GPU access to the table. I've proven it with many conclusive tests over the past 3 days. This is only really necessary on big endian, but other than that: Acked-By: Michel Dänzer daen...@vmware.com Thanks. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
From: Benjamin Herrenschmidt b...@kernel.crashing.org Date: Thu, 12 Feb 2009 21:35:59 +1100 Oh BTW something else to be careful with, though I suppose it's working some what by accident right now... when the GART is in the frame buffer it gets applied the current fb swapper setting... ouch ! So it might be a good idea, if we're going to use DRM_READ/WRITE32 which afaik are readl/writel (ie, swapping) to make sure we at least temporarily disable that swapper while whacking the GART. So I've narrowed down the final problems I'm having. It's the surface swapping settings indeed. Running Xorg at depth 8, the CP can execute indirect buffers just fine, no code changes necessary. However, the CP hangs on indirect execution for depth 16 and 24. But these depths work if I hard disable the surface byte swapping settings in the radeon Xorg driver. I did some research, and it does appear that the GART does read the PTEs from the VRAM using the Host Data Path. This means the surface control byte swapping settings are applied. So for depths of 16 and 24, the GART is reading garbage PTEs. And that's why the CP hangs. I have no idea how to handle this properly. Not only does the Xorg ATI driver set the swapping settings at an arbitray point, it also mucks with them dynamically f.e. in the render helper functions. The only way this can work properly is to choose one surface swapping setting, set the VRAM GART table so that the GART sees the PTEs in the correct format, and retain those settings throughout. It seems like, in at least R5xx, they tried to add a control bit in the PCI-E GART registers to handle this. Bit 6 in PCI-E indirect register 0x10 is named 'GART_RDREQPATH_SEL' and has description: Read request path 0=HDP 1=Direct Rd Req to MC This seems to be intended to be a way to have the GART PTE reads bypass the full Host Data Path (and thus potential surface byte swaps), by setting this bit to 1. I tried doing that, but it doesn't help my RV370 so perhaps older chips don't support this bit. It's hard to tell as this is the area where all of AMD's published GPU documents are severely lacking. I tested whether reading back the PCIE_TX_GART_CNTL register shows bit 6 after I try to write it, and it always reads back zero. There are even more complications, the VT enter/exit code in the Xorg ATI driver tries to save and restore the VRAM GART table for PCI-E cards. But this: 1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead of the constant 4096 to determine how many GART entries there are. The PCI GART entries map 4096 bytes, always. So using getpagesize() is wrong. (see RADEONDRIGetPciAperTableSize) I have this fixed in my local tree. 2) It doesn't check the surface byte swapping settings, so it could be saving in one byte order and restoing in another. I guess we could force RADEON_SURFACE_CNTL to zero around the two memcpy()'s done in radeon_driver.c But really this whole area is a mess, and I know KMS is coming to fix this, but even in the traditional XORG/DRM layout XORG has no business keeping the GART table uptodate across VT switches. It should be calling into the DRM with an ioctl to write the GART table out to VRAM again. Finally there is a huge issue with how the Xorg ATI driver resets the chip using the RBBM. It soft resets the CP, but this resets the ring read pointer. However, nothing is done to make sure the DRM resync's the ring write pointer (which remains unchanged by a soft CP reset) as well as it's internal software ring state. The result is that on the very next CP command submission, the CP re-executes everything from ring entry zero until wherever the DRM things the write pointer was at the time of the CP soft reset. Any time the Xorg ATI driver does a CP soft reset, it should do CP stop and resume calls to the DRM to resync the ring pointers. And this does not appear to be happening where it needs to be happening. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH]: drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init().
drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init(). The variable 'max_pages' is ambiguous. There are two concepts of pages being used in this function. First, we have ATI GART pages which are always 4096 bytes. Then, we have system pages which are of size PAGE_SIZE. Eliminate the confusion by creating max_ati_pages and max_real_pages. Calculate and use them as appropriate. Signed-off-by: David S. Miller da...@davemloft.net --- drivers/gpu/drm/ati_pcigart.c | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c index 7972ec8..4d86a62 100644 --- a/drivers/gpu/drm/ati_pcigart.c +++ b/drivers/gpu/drm/ati_pcigart.c @@ -102,7 +102,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga u32 *pci_gart, page_base, gart_idx; dma_addr_t bus_address = 0; int i, j, ret = 0; - int max_pages; + int max_ati_pages, max_real_pages; if (!entry) { DRM_ERROR(no scatter/gather memory!\n); @@ -130,14 +130,15 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga pci_gart = (u32 *) address; - max_pages = (gart_info-table_size / sizeof(u32)); - pages = (entry-pages = max_pages) - ? entry-pages : max_pages; + max_ati_pages = (gart_info-table_size / sizeof(u32)); + max_real_pages = max_ati_pages / (PAGE_SIZE / ATI_PCIGART_PAGE_SIZE); + pages = (entry-pages = max_real_pages) + ? entry-pages : max_real_pages; if (gart_info-gart_table_location == DRM_ATI_GART_MAIN) { - memset(pci_gart, 0, max_pages * sizeof(u32)); + memset(pci_gart, 0, max_ati_pages * sizeof(u32)); } else { - for (gart_idx = 0; gart_idx max_pages; gart_idx++) + for (gart_idx = 0; gart_idx max_ati_pages; gart_idx++) DRM_WRITE32(map, gart_idx * sizeof(u32), 0); } -- 1.6.1.2.350.g88cc -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
The PCI GART table initialization code treats the GART table mapping unconditionally as a kernel virtual address. But it could be in the framebuffer, for example, and thus we're dealing with a PCI MEM space ioremap() cookie. Treating that as a virtual address is illegal and will crash some system types (such as sparc64 where the ioremap() return value is actually a physical I/O address). So access the area correctly, using gart_info-gart_table_location as our guide. Signed-off-by: David S. Miller da...@davemloft.net --- drivers/gpu/drm/ati_pcigart.c | 26 -- 1 files changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c index c533d0c..2cd827a 100644 --- a/drivers/gpu/drm/ati_pcigart.c +++ b/drivers/gpu/drm/ati_pcigart.c @@ -95,10 +95,11 @@ EXPORT_SYMBOL(drm_ati_pcigart_cleanup); int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *gart_info) { + struct drm_local_map *map = gart_info-mapping; struct drm_sg_mem *entry = dev-sg; void *address = NULL; unsigned long pages; - u32 *pci_gart, page_base; + u32 *pci_gart, page_base, gart_idx; dma_addr_t bus_address = 0; int i, j, ret = 0; int max_pages; @@ -133,8 +134,14 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga pages = (entry-pages = max_pages) ? entry-pages : max_pages; - memset(pci_gart, 0, max_pages * sizeof(u32)); + if (gart_info-gart_table_location == DRM_ATI_GART_MAIN) { + memset(pci_gart, 0, max_pages * sizeof(u32)); + } else { + for (gart_idx = 0; gart_idx max_pages; gart_idx++) + DRM_WRITE32(map, gart_idx * sizeof(u32), 0); + } + gart_idx = 0; for (i = 0; i pages; i++) { /* we need to support large memory configurations */ entry-busaddr[i] = pci_map_page(dev-pdev, entry-pagelist[i], @@ -149,19 +156,26 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga page_base = (u32) entry-busaddr[i]; for (j = 0; j (PAGE_SIZE / ATI_PCIGART_PAGE_SIZE); j++) { + u32 val; + switch(gart_info-gart_reg_if) { case DRM_ATI_GART_IGP: - *pci_gart = cpu_to_le32((page_base) | 0xc); + val = page_base | 0xc; break; case DRM_ATI_GART_PCIE: - *pci_gart = cpu_to_le32((page_base 8) | 0xc); + val = (page_base 8) | 0xc; break; default: case DRM_ATI_GART_PCI: - *pci_gart = cpu_to_le32(page_base); + val = page_base; break; } - pci_gart++; + if (gart_info-gart_table_location == + DRM_ATI_GART_MAIN) + pci_gart[gart_idx] = cpu_to_le32(val); + else + DRM_WRITE32(map, gart_idx * sizeof(u32), val); + gart_idx++; page_base += ATI_PCIGART_PAGE_SIZE; } } -- 1.6.1.2.350.g88cc -- Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM) software. With Adobe AIR, Ajax developers can use existing skills and code to build responsive, highly engaging applications that combine the power of local resources and data with the reach of the web. Download the Adobe AIR SDK and Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 2/5]: drm: ati_pcigart: Need to use PCI_DMA_BIDIRECTIONAL.
The buffers mapped by the PCI GART can be written to by the device, not just read. For example, this happens via the RB_RPTR writeback on Radeon. So we can't use PCI_DMA_TODEVICE else we'll get protection faults on IOMMU platforms. Signed-off-by: David S. Miller da...@davemloft.net --- drivers/gpu/drm/ati_pcigart.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c index 2cd827a..7972ec8 100644 --- a/drivers/gpu/drm/ati_pcigart.c +++ b/drivers/gpu/drm/ati_pcigart.c @@ -77,7 +77,7 @@ int drm_ati_pcigart_cleanup(struct drm_device *dev, struct drm_ati_pcigart_info if (!entry-busaddr[i]) break; pci_unmap_page(dev-pdev, entry-busaddr[i], -PAGE_SIZE, PCI_DMA_TODEVICE); +PAGE_SIZE, PCI_DMA_BIDIRECTIONAL); } if (gart_info-gart_table_location == DRM_ATI_GART_MAIN) @@ -145,7 +145,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga for (i = 0; i pages; i++) { /* we need to support large memory configurations */ entry-busaddr[i] = pci_map_page(dev-pdev, entry-pagelist[i], -0, PAGE_SIZE, PCI_DMA_TODEVICE); +0, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL); if (entry-busaddr[i] == 0) { DRM_ERROR(unable to map PCIGART pages!\n); drm_ati_pcigart_cleanup(dev, gart_info); -- 1.6.1.2.350.g88cc -- Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM) software. With Adobe AIR, Ajax developers can use existing skills and code to build responsive, highly engaging applications that combine the power of local resources and data with the reach of the web. Download the Adobe AIR SDK and Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 3/5]: drm: radeon: Fix ring_rptr accesses.
The memory behind ring_rptr can either be in ioremapped memory or a vmalloc() normal kernel memory buffer. However, the code unconditionally uses DRM_{READ,WRITE}32() (and thus readl() and writel()) to access it. Basically, if RADEON_IS_AGP then it's ioremap()'d memory else it's vmalloc'd memory. Adjust all of the ring_rptr access code as needed. While we're here, kill the 'scratch' pointer in drm_radeon_private. It's only used in the one place where it is initialized. Signed-off-by: David S. Miller da...@davemloft.net --- drivers/gpu/drm/radeon/radeon_cp.c| 70 +++-- drivers/gpu/drm/radeon/radeon_drv.h | 17 drivers/gpu/drm/radeon/radeon_state.c |6 +- 3 files changed, 70 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_cp.c b/drivers/gpu/drm/radeon/radeon_cp.c index ae4ffa1..9053755 100644 --- a/drivers/gpu/drm/radeon/radeon_cp.c +++ b/drivers/gpu/drm/radeon/radeon_cp.c @@ -43,6 +43,52 @@ static int radeon_do_cleanup_cp(struct drm_device * dev); static void radeon_do_cp_start(drm_radeon_private_t * dev_priv); +static u32 radeon_read_ring_rptr(drm_radeon_private_t *dev_priv, u32 off) +{ + u32 val; + + if (dev_priv-flags RADEON_IS_AGP) { + val = DRM_READ32(dev_priv-ring_rptr, off); + } else { + val = *(((volatile u32 *) +dev_priv-ring_rptr-handle) + + (off / sizeof(u32))); + val = le32_to_cpu(val); + } + return val; +} + +u32 radeon_get_ring_head(drm_radeon_private_t *dev_priv) +{ + if (dev_priv-writeback_works) + return radeon_read_ring_rptr(dev_priv, 0); + else + return RADEON_READ(RADEON_CP_RB_RPTR); +} + +static void radeon_write_ring_rptr(drm_radeon_private_t *dev_priv, u32 off, u32 val) +{ + if (dev_priv-flags RADEON_IS_AGP) + DRM_WRITE32(dev_priv-ring_rptr, off, val); + else + *(((volatile u32 *) dev_priv-ring_rptr-handle) + + (off / sizeof(u32))) = cpu_to_le32(val); +} + +void radeon_set_ring_head(drm_radeon_private_t *dev_priv, u32 val) +{ + radeon_write_ring_rptr(dev_priv, 0, val); +} + +u32 radeon_get_scratch(drm_radeon_private_t *dev_priv, int index) +{ + if (dev_priv-writeback_works) + return radeon_read_ring_rptr(dev_priv, +RADEON_SCRATCHOFF(index)); + else + return RADEON_READ(RADEON_SCRATCH_REG0 + 4*index); +} + static u32 R500_READ_MCIND(drm_radeon_private_t *dev_priv, int addr) { u32 ret; @@ -647,10 +693,6 @@ static void radeon_cp_init_ring_buffer(struct drm_device * dev, RADEON_WRITE(RADEON_SCRATCH_ADDR, RADEON_READ(RADEON_CP_RB_RPTR_ADDR) + RADEON_SCRATCH_REG_OFFSET); - dev_priv-scratch = ((__volatile__ u32 *) -dev_priv-ring_rptr-handle + -(RADEON_SCRATCH_REG_OFFSET / sizeof(u32))); - RADEON_WRITE(RADEON_SCRATCH_UMSK, 0x7); /* Turn on bus mastering */ @@ -668,13 +710,13 @@ static void radeon_cp_init_ring_buffer(struct drm_device * dev, RADEON_WRITE(RADEON_BUS_CNTL, tmp); } /* PCIE cards appears to not need this */ - dev_priv-scratch[0] = 0; + radeon_write_ring_rptr(dev_priv, RADEON_SCRATCHOFF(0), 0); RADEON_WRITE(RADEON_LAST_FRAME_REG, 0); - dev_priv-scratch[1] = 0; + radeon_write_ring_rptr(dev_priv, RADEON_SCRATCHOFF(1), 0); RADEON_WRITE(RADEON_LAST_DISPATCH_REG, 0); - dev_priv-scratch[2] = 0; + radeon_write_ring_rptr(dev_priv, RADEON_SCRATCHOFF(2), 0); RADEON_WRITE(RADEON_LAST_CLEAR_REG, 0); radeon_do_wait_for_idle(dev_priv); @@ -698,12 +740,15 @@ static void radeon_test_writeback(drm_radeon_private_t * dev_priv) /* Writeback doesn't seem to work everywhere, test it here and possibly * enable it if it appears to work */ - DRM_WRITE32(dev_priv-ring_rptr, RADEON_SCRATCHOFF(1), 0); + radeon_write_ring_rptr(dev_priv, RADEON_SCRATCHOFF(1), 0); + RADEON_WRITE(RADEON_SCRATCH_REG1, 0xdeadbeef); for (tmp = 0; tmp dev_priv-usec_timeout; tmp++) { - if (DRM_READ32(dev_priv-ring_rptr, RADEON_SCRATCHOFF(1)) == - 0xdeadbeef) + u32 val; + + val = radeon_read_ring_rptr(dev_priv, RADEON_SCRATCHOFF(1)); + if (val == 0xdeadbeef) break; DRM_UDELAY(1); } @@ -1540,7 +1585,7 @@ struct drm_buf *radeon_freelist_get(struct drm_device * dev) start = dev_priv-last_buf; for (t = 0; t dev_priv-usec_timeout; t++) { - u32 done_age = GET_SCRATCH(1); + u32 done_age = GET_SCRATCH(dev_priv, 1); DRM_DEBUG(done_age = %d\n, done_age); for (i = start; i
[PATCH 5/5]: drm: radeon: Fix calculation of RB_RPTR_ADDR in non-AGP case.
The address needs to be a GART relative address, rather than a PCI DMA address. Signed-off-by: David S. Miller da...@davemloft.net --- drivers/gpu/drm/radeon/radeon_cp.c | 15 --- 1 files changed, 4 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_cp.c b/drivers/gpu/drm/radeon/radeon_cp.c index 9053755..e235778 100644 --- a/drivers/gpu/drm/radeon/radeon_cp.c +++ b/drivers/gpu/drm/radeon/radeon_cp.c @@ -655,17 +655,10 @@ static void radeon_cp_init_ring_buffer(struct drm_device * dev, } else #endif { - struct drm_sg_mem *entry = dev-sg; - unsigned long tmp_ofs, page_ofs; - - tmp_ofs = dev_priv-ring_rptr-offset - - (unsigned long)dev-sg-virtual; - page_ofs = tmp_ofs PAGE_SHIFT; - - RADEON_WRITE(RADEON_CP_RB_RPTR_ADDR, entry-busaddr[page_ofs]); - DRM_DEBUG(ring rptr: offset=0x%08lx handle=0x%08lx\n, - (unsigned long)entry-busaddr[page_ofs], - entry-handle + tmp_ofs); + RADEON_WRITE(RADEON_CP_RB_RPTR_ADDR, +dev_priv-ring_rptr-offset +- ((unsigned long) dev-sg-virtual) ++ dev_priv-gart_vm_start); } /* Set ring buffer size */ -- 1.6.1.2.350.g88cc -- Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM) software. With Adobe AIR, Ajax developers can use existing skills and code to build responsive, highly engaging applications that combine the power of local resources and data with the reach of the web. Download the Adobe AIR SDK and Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 4/5]: drm: radeon: Fix RADEON_*_EMITED defines.
These are not supposed to be booleans, they are supposed to be bit masks. Signed-off-by: David S. Miller da...@davemloft.net --- drivers/gpu/drm/radeon/radeon_drv.h |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.h b/drivers/gpu/drm/radeon/radeon_drv.h index a253cf0..9b60a26 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.h +++ b/drivers/gpu/drm/radeon/radeon_drv.h @@ -219,8 +219,8 @@ struct radeon_virt_surface { struct drm_file *file_priv; }; -#define RADEON_FLUSH_EMITED(1 0) -#define RADEON_PURGE_EMITED(1 1) +#define RADEON_FLUSH_EMITED(1 0) +#define RADEON_PURGE_EMITED(1 1) struct drm_radeon_master_private { drm_local_map_t *sarea; -- 1.6.1.2.350.g88cc -- Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM) software. With Adobe AIR, Ajax developers can use existing skills and code to build responsive, highly engaging applications that combine the power of local resources and data with the reach of the web. Download the Adobe AIR SDK and Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
From: Benjamin Herrenschmidt b...@kernel.crashing.org Date: Thu, 12 Feb 2009 21:35:59 +1100 Oh BTW something else to be careful with, though I suppose it's working some what by accident right now... when the GART is in the frame buffer it gets applied the current fb swapper setting... ouch ! So it might be a good idea, if we're going to use DRM_READ/WRITE32 which afaik are readl/writel (ie, swapping) to make sure we at least temporarily disable that swapper while whacking the GART. Cute. I wonder if this is what is tripping me up. But, looking more closely, it appears that: 1) The kernel radeon framebuffer driver doesn't mess with the framebuffer endianness setting. 2) On = R300 (which my chip is), Xorg leaves it alone too. -- Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM) software. With Adobe AIR, Ajax developers can use existing skills and code to build responsive, highly engaging applications that combine the power of local resources and data with the reach of the web. Download the Adobe AIR SDK and Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
From: Dave Airlie airl...@gmail.com Date: Thu, 12 Feb 2009 21:23:13 +1000 are you on a PCI or PCIE card, I've no idea what buses you have on sparc64. On the PCI cards the GART table will always be in main memory. PCIE always in VRAM. PCI-E -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH 0/5]: ATI/RADEON DRM bug fixes...
From: Dave Airlie airl...@gmail.com Date: Thu, 12 Feb 2009 21:26:51 +1000 On Thu, Feb 12, 2009 at 8:15 PM, David Miller da...@davemloft.net wrote: David, this work is against your drm-next branch. Here are a collection of bug fixes for the Radeon DRM support. Most of them have to do with trying to access kernel virtual addresses using DRM_READ32() and DRM_WRITE32(). With these patches at least the writeback test works on sparc64 and the CP is able to process commands in the ring. I'm now diagnosing some further problem that's preventing Xorg from functioning fully but I should be able to diagnose that soon. You'll probably love patch #4 in this series, and it probably explains all kinds of weird problems people run into with DRM on radeon cards. Thanks Dave, These all look great, the EMITED one has potential to regress something by fixing it, but I think it might actually make things better, I'll push all of these to Linus asap as they all fix real bugs. I've put them into my drm-fixes branch and I'll try and re-review and get some testing tomorrow when I'm more awake. Note that these depend upon Ben's drm_addmap and resource fixed in your drm-next branch, sparc64 was also broken by those problems. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: drm + 4GB RAM + swiotlb = drm craps out
From: Dave Airlie [EMAIL PROTECTED] Date: Mon, 2 Apr 2007 09:44:41 +1000 Okay I've got a bug reported before and now again about 4GB + radeon blows up the DRM... on Intel hw... What the drm currently does for the PCI GART table is it allocates a chunk of memory (8MB) with vmalloc_32(), then when it decides to use it it goes through every page of it calls pci_map_single() (with PCI_DMA_TODEVICE, which is probably wrong...) with every page from the vmalloc mapping and puts the bus addresses of the pages into the PCI GART table on the GPU. So when swiotlb happens, as you can guess it all falls apart as the drm never calls sync functions at any stage... You would have hit this on any platform that does caching in the PCI controller as well. The main problem is the ring buffer and scratch write back, these values are read/write from both the CPU and GPU quite a lot, so this leads me to think I should really just be using dma_alloc_coherent for the whole lot, however this is an 8MB mapping and possibly could be getting larger in the future and dynamic as we do dynamic PCIEGART support for the radeons... So I suppose I'm asking for ideas on the correct way to do this, and perhaps any quick way to patch up the problem I'm seeing now by making swiotlb not get involved Coherent memory was created for precisely the case where the cpu and the device frequently access the memory. 8MB is indeed a lot for the kind of allocation that the coherent DMA implementation uses. Does it really have to be all in one big 8MB chunk? I doubt it. Perhaps you can therefore create multiple DMA pools instead? See include/linux/dmapool.h - Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: drm + 4GB RAM + swiotlb = drm craps out
From: Dave Airlie [EMAIL PROTECTED] Date: Mon, 2 Apr 2007 14:08:13 +1000 So when swiotlb happens, as you can guess it all falls apart as the drm never calls sync functions at any stage... You would have hit this on any platform that does caching in the PCI controller as well. We must not have a great intersect of radeon and such systems.. It might explain why my machine hung when I tried to use radeon with DRM on my sparc64 workstation :-) I have investigating that on my todo list. It currently is required to be in a big 8MB chunk as it gets chopped up by the X server not the kernel, so kernel needs to allocate pages to back it when X inits, yes this is ugly, no it can't be fixed without time-travelling and fixing deployed X servers... Really we probably only need the ring buffer to be in coherent memory, the rest of the stuff is used for DMA buffers which are mainly filled by the CPU and read by the GPU. However I cannot change this without breaking X, the solution is really to use TTM for this sort of stuff I'm a bit worried as the AGP driver now uses vmalloc_32 which really is a meaningless interface on 64-bit systems.. I don't know what to recommend to you, getting 8MB of linear memory really just isn't practical. Perhaps we'll have to create something ugly like vmalloc_nobounce(). Remind me again why you're ending up with swiotlb'd pages? vmalloc_32() uses GFP_KERNEL which should use entirely lowmem and thus RAM below 4GB and not anything which should need bounce buffering. You should only get swiotlb'd pages if __GFP_HIGHMEM were set in the gfp flags. Are you expecting to be able to virtually remap these pages in PCI space as one huge 8MB chunk too and that's how swiotlb gets involved? That won't work, sorry... - Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: drm + 4GB RAM + swiotlb = drm craps out
From: Dave Airlie [EMAIL PROTECTED] Date: Mon, 2 Apr 2007 15:15:48 +1000 Perhaps we'll have to create something ugly like vmalloc_nobounce(). Remind me again why you're ending up with swiotlb'd pages? vmalloc_32() uses GFP_KERNEL which should use entirely lowmem and thus RAM below 4GB and not anything which should need bounce buffering. On a 64-bit machine GFP_KERNEL can give me any memory... it all works fine on 32-bit highmem kernel as I don't get highmem... I really need __GFP_DMA32 memory but we don't have a generic allocator that gives this out that I can see.. That clears things up thanks. Perhaps the other uses of vmalloc_32() want GFP_32 semantics too, although I didn't check. - Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel