Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Thu, Apr 2, 2015 at 12:45 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Wed, Apr 01, 2015 at 05:04:08PM -0700, Andy Lutomirski wrote: On Wed, Apr 1, 2015 at 4:52 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 28, 2015 at 02:23:34PM +0200, Ville Syrjälä wrote: On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote: On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: Whatever the code may or may not do, the x86 architecture uses power-of-two MTRR sizes. So I'm confused. There should be no confusion, I simply did not know that *was* the requirement for x86, if that is the case we should add a check for that and perhaps generalize a helper that does the power of two helper changes, the cleanest I found was the vesafb driver solution. Thoughts? The vesafb solution is bad since you'll only end up covering only the first 4MB of the framebuffer instead of the almost 8MB you want. Which in practice will mean throwing away half the VRAM since you really don't want the massive performance hit from accessing it as UC. And that would mean giving up decent display resolutions as well :( And the other option of trying to cover the remainder with multiple ever smaller MTRRs doesn't work either since you'll run out of MTRRs very quickly. This is precisely why I used the hole method in atyfb in the first place. I don't really like the idea of any new mtrr code not supporting that use case, especially as these things tend to be present in older machines where PAT isn't an option. According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6, non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have an effective memory type of UC. This is true but non-PAT systems that use just ioremap() will default to _PAGE_CACHE_MODE_UC_MINUS, not _PAGE_CACHE_MODE_UC, and _PAGE_CACHE_MODE_UC_MINUS on Linux has PCD = 1, PWT = 0. The list comes from: uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = { [_PAGE_CACHE_MODE_WB ] = 0 | 0, [_PAGE_CACHE_MODE_WC ] = _PAGE_PWT | 0, [_PAGE_CACHE_MODE_UC_MINUS] = 0 | _PAGE_PCD, [_PAGE_CACHE_MODE_UC ] = _PAGE_PWT | _PAGE_PCD, [_PAGE_CACHE_MODE_WT ] = 0 | _PAGE_PCD, [_PAGE_CACHE_MODE_WP ] = 0 | _PAGE_PCD, }; This can better be read here: PAT |PCD ||PWT ||| 000 WB _PAGE_CACHE_MODE_WB 001 WC _PAGE_CACHE_MODE_WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS 011 UC _PAGE_CACHE_MODE_UC On x86 ioremap() defaults to ioremap_nocache() and right now that uses _PAGE_CACHE_MODE_UC_MINUS not _PAGE_CACHE_MODE_UC. We have two cases to consider for non-PAT systems then: a) Right now as ioremap() and ioremap_nocache() default to _PAGE_CACHE_MODE_UC_MINUS on x86. In this case using a WC MTRR seems to use PWT=0, PCD=1, and table table
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Wed, Apr 01, 2015 at 05:04:08PM -0700, Andy Lutomirski wrote: On Wed, Apr 1, 2015 at 4:52 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 28, 2015 at 02:23:34PM +0200, Ville Syrjälä wrote: On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote: On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: Whatever the code may or may not do, the x86 architecture uses power-of-two MTRR sizes. So I'm confused. There should be no confusion, I simply did not know that *was* the requirement for x86, if that is the case we should add a check for that and perhaps generalize a helper that does the power of two helper changes, the cleanest I found was the vesafb driver solution. Thoughts? The vesafb solution is bad since you'll only end up covering only the first 4MB of the framebuffer instead of the almost 8MB you want. Which in practice will mean throwing away half the VRAM since you really don't want the massive performance hit from accessing it as UC. And that would mean giving up decent display resolutions as well :( And the other option of trying to cover the remainder with multiple ever smaller MTRRs doesn't work either since you'll run out of MTRRs very quickly. This is precisely why I used the hole method in atyfb in the first place. I don't really like the idea of any new mtrr code not supporting that use case, especially as these things tend to be present in older machines where PAT isn't an option. According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6, non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have an effective memory type of UC. This is true but non-PAT systems that use just ioremap() will default to _PAGE_CACHE_MODE_UC_MINUS, not _PAGE_CACHE_MODE_UC, and _PAGE_CACHE_MODE_UC_MINUS on Linux has PCD = 1, PWT = 0. The list comes from: uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = { [_PAGE_CACHE_MODE_WB ] = 0 | 0, [_PAGE_CACHE_MODE_WC ] = _PAGE_PWT | 0, [_PAGE_CACHE_MODE_UC_MINUS] = 0 | _PAGE_PCD, [_PAGE_CACHE_MODE_UC ] = _PAGE_PWT | _PAGE_PCD, [_PAGE_CACHE_MODE_WT ] = 0 | _PAGE_PCD, [_PAGE_CACHE_MODE_WP ] = 0 | _PAGE_PCD, }; This can better be read here: PAT |PCD ||PWT ||| 000 WB _PAGE_CACHE_MODE_WB 001 WC _PAGE_CACHE_MODE_WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS 011 UC _PAGE_CACHE_MODE_UC On x86 ioremap() defaults to ioremap_nocache() and right now that uses _PAGE_CACHE_MODE_UC_MINUS not _PAGE_CACHE_MODE_UC. We have two cases to consider for non-PAT systems then: a) Right now as ioremap() and ioremap_nocache() default to _PAGE_CACHE_MODE_UC_MINUS on x86. In this case using a WC MTRR seems to use PWT=0, PCD=1, and table table 11-6 on non-PAT systems seems to place this situation as implementation
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Sat, Mar 28, 2015 at 02:23:34PM +0200, Ville Syrjälä wrote: On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote: On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: Whatever the code may or may not do, the x86 architecture uses power-of-two MTRR sizes. So I'm confused. There should be no confusion, I simply did not know that *was* the requirement for x86, if that is the case we should add a check for that and perhaps generalize a helper that does the power of two helper changes, the cleanest I found was the vesafb driver solution. Thoughts? The vesafb solution is bad since you'll only end up covering only the first 4MB of the framebuffer instead of the almost 8MB you want. Which in practice will mean throwing away half the VRAM since you really don't want the massive performance hit from accessing it as UC. And that would mean giving up decent display resolutions as well :( And the other option of trying to cover the remainder with multiple ever smaller MTRRs doesn't work either since you'll run out of MTRRs very quickly. This is precisely why I used the hole method in atyfb in the first place. I don't really like the idea of any new mtrr code not supporting that use case, especially as these things tend to be present in older machines where PAT isn't an option. According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6, non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have an effective memory type of UC. This is true but non-PAT systems that use just ioremap() will default to _PAGE_CACHE_MODE_UC_MINUS, not _PAGE_CACHE_MODE_UC, and _PAGE_CACHE_MODE_UC_MINUS on Linux has PCD = 1, PWT = 0. The list comes from: uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = { [_PAGE_CACHE_MODE_WB ] = 0 | 0, [_PAGE_CACHE_MODE_WC ] = _PAGE_PWT | 0, [_PAGE_CACHE_MODE_UC_MINUS] = 0 | _PAGE_PCD, [_PAGE_CACHE_MODE_UC ] = _PAGE_PWT | _PAGE_PCD, [_PAGE_CACHE_MODE_WT ] = 0 | _PAGE_PCD, [_PAGE_CACHE_MODE_WP ] = 0 | _PAGE_PCD, }; This can better be read here: PAT |PCD ||PWT ||| 000 WB _PAGE_CACHE_MODE_WB 001 WC _PAGE_CACHE_MODE_WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS 011 UC _PAGE_CACHE_MODE_UC On x86 ioremap() defaults to ioremap_nocache() and right now that uses _PAGE_CACHE_MODE_UC_MINUS not _PAGE_CACHE_MODE_UC. We have two cases to consider for non-PAT systems then: a) Right now as ioremap() and ioremap_nocache() default to _PAGE_CACHE_MODE_UC_MINUS on x86. In this case using a WC MTRR seems to use PWT=0, PCD=1, and table table 11-6 on non-PAT systems seems to place this situation as implementation defined and not encouraged. a) when commit de33c442e x86 PAT: fix performance drop for glx, use UC minus for
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Wed, Apr 1, 2015 at 4:52 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 28, 2015 at 02:23:34PM +0200, Ville Syrjälä wrote: On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote: On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: Whatever the code may or may not do, the x86 architecture uses power-of-two MTRR sizes. So I'm confused. There should be no confusion, I simply did not know that *was* the requirement for x86, if that is the case we should add a check for that and perhaps generalize a helper that does the power of two helper changes, the cleanest I found was the vesafb driver solution. Thoughts? The vesafb solution is bad since you'll only end up covering only the first 4MB of the framebuffer instead of the almost 8MB you want. Which in practice will mean throwing away half the VRAM since you really don't want the massive performance hit from accessing it as UC. And that would mean giving up decent display resolutions as well :( And the other option of trying to cover the remainder with multiple ever smaller MTRRs doesn't work either since you'll run out of MTRRs very quickly. This is precisely why I used the hole method in atyfb in the first place. I don't really like the idea of any new mtrr code not supporting that use case, especially as these things tend to be present in older machines where PAT isn't an option. According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6, non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have an effective memory type of UC. This is true but non-PAT systems that use just ioremap() will default to _PAGE_CACHE_MODE_UC_MINUS, not _PAGE_CACHE_MODE_UC, and _PAGE_CACHE_MODE_UC_MINUS on Linux has PCD = 1, PWT = 0. The list comes from: uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = { [_PAGE_CACHE_MODE_WB ] = 0 | 0, [_PAGE_CACHE_MODE_WC ] = _PAGE_PWT | 0, [_PAGE_CACHE_MODE_UC_MINUS] = 0 | _PAGE_PCD, [_PAGE_CACHE_MODE_UC ] = _PAGE_PWT | _PAGE_PCD, [_PAGE_CACHE_MODE_WT ] = 0 | _PAGE_PCD, [_PAGE_CACHE_MODE_WP ] = 0 | _PAGE_PCD, }; This can better be read here: PAT |PCD ||PWT ||| 000 WB _PAGE_CACHE_MODE_WB 001 WC _PAGE_CACHE_MODE_WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS 011 UC _PAGE_CACHE_MODE_UC On x86 ioremap() defaults to ioremap_nocache() and right now that uses _PAGE_CACHE_MODE_UC_MINUS not _PAGE_CACHE_MODE_UC. We have two cases to consider for non-PAT systems then: a) Right now as ioremap() and ioremap_nocache() default to _PAGE_CACHE_MODE_UC_MINUS on x86. In this case using a WC MTRR seems to use PWT=0, PCD=1, and table table 11-6 on non-PAT systems seems to place this situation as implementation defined and not encouraged. a) when commit de33c442e x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range() gets
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote: On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: Whatever the code may or may not do, the x86 architecture uses power-of-two MTRR sizes. So I'm confused. There should be no confusion, I simply did not know that *was* the requirement for x86, if that is the case we should add a check for that and perhaps generalize a helper that does the power of two helper changes, the cleanest I found was the vesafb driver solution. Thoughts? The vesafb solution is bad since you'll only end up covering only the first 4MB of the framebuffer instead of the almost 8MB you want. Which in practice will mean throwing away half the VRAM since you really don't want the massive performance hit from accessing it as UC. And that would mean giving up decent display resolutions as well :( And the other option of trying to cover the remainder with multiple ever smaller MTRRs doesn't work either since you'll run out of MTRRs very quickly. This is precisely why I used the hole method in atyfb in the first place. I don't really like the idea of any new mtrr code not supporting that use case, especially as these things tend to be present in older machines where PAT isn't an option. According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6, non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have an effective memory type of UC. Hence my suggestion to add ioremap_x86_uc and/or set_memory_x86_uc to punch a UC hole in an otherwise WC MTRR-covered region. OK I think I get it now. And I take it this would hopefully only be used for non-PAT systems? Would there be a use case for PAT systems? I wonder if we can wrap this under some APIs to make it clean and hide this dirty thing behind the scenes, it seems a fragile and error prone and my hope would be that we won't need more specialization in this area for PAT systems. One potential complication is kernel vs. userspace mmap. MTRR applies to the physical address, but PAT applies to the virtual address, so with the WC MTRR you get WC for userspace for free as well. Also the userspace mmaps request will have the length of smem_len (at most), so it won't be the nice power of two in that case. Also on PAT systems w/o a BIOS provided WC MTRR, the fbdev mmap seems to be total crap at the moment. IIRC I have a patch to fix things a bit... From 4e6d70d223f35953c8a11a58cf3376a8a001fa4f Mon Sep 17 00:00:00 2001 From: Ville Syrjala syrj...@sci.fi Date: Fri, 15 Apr 2011 04:02:43 +0300 Subject: [PATCH] fb: writecombine fb --- drivers/video/fbdev/core/fbmem.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c index 0705d88..ecbde0e 100644 --- a/drivers/video/fbdev/core/fbmem.c +++ b/drivers/video/fbdev/core/fbmem.c @@ -1396,6 +1396,7 @@ fb_mmap(struct file *file, struct vm_area_struct * vma) unsigned long mmio_pgoff; unsigned long start; u32 len; + bool mmio = false; if (!info) return -ENODEV; @@ -1426,11 +1427,20 @@ fb_mmap(struct file *file,
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: Whatever the code may or may not do, the x86 architecture uses power-of-two MTRR sizes. So I'm confused. There should be no confusion, I simply did not know that *was* the requirement for x86, if that is the case we should add a check for that and perhaps generalize a helper that does the power of two helper changes, the cleanest I found was the vesafb driver solution. Thoughts? The vesafb solution is bad since you'll only end up covering only the first 4MB of the framebuffer instead of the almost 8MB you want. Which in practice will mean throwing away half the VRAM since you really don't want the massive performance hit from accessing it as UC. And that would mean giving up decent display resolutions as well :( And the other option of trying to cover the remainder with multiple ever smaller MTRRs doesn't work either since you'll run out of MTRRs very quickly. This is precisely why I used the hole method in atyfb in the first place. I don't really like the idea of any new mtrr code not supporting that use case, especially as these things tend to be present in older machines where PAT isn't an option. -- Ville Syrjälä syrj...@sci.fi http://www.sci.fi/~syrjala/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? Still waiting for an answer... - if (par-mtrr_aper = 0 !par-aux_start) { - /* Make a hole for mmio. */ - par-mtrr_reg = mtrr_add(par-res_start + 0x80 - -GUI_RESERVE, GUI_RESERVE, -MTRR_TYPE_UNCACHABLE, 1); - if (par-mtrr_reg 0) { - mtrr_del(par-mtrr_aper, 0, 0); - par-mtrr_aper = -1; - } - } } #endif @@ -2776,10 +2765,6 @@ aty_init_exit: par-pll_ops-set_pll(info, par-saved_pll); #ifdef CONFIG_MTRR - if (par-mtrr_reg = 0) { - mtrr_del(par-mtrr_reg, 0, 0); - par-mtrr_reg = -1; - } if (par-mtrr_aper = 0) { mtrr_del(par-mtrr_aper, 0, 0); par-mtrr_aper = -1; @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, } info-fix.mmio_start = raddr; - par-ati_regbase = ioremap(info-fix.mmio_start, 0x1000); + par-ati_regbase = ioremap_nocache(info-fix.mmio_start, 0x1000); if (par-ati_regbase == NULL) return -ENOMEM; @@ -3491,6 +3476,8 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, info-fix.smem_start = addr; info-fix.smem_len = 0x80; + aty_fudge_framebuffer_len(info); + info-screen_base = ioremap(info-fix.smem_start, info-fix.smem_len); if (info-screen_base == NULL) { ret = -ENOMEM; @@ -3563,6 +3550,7 @@ static int atyfb_pci_probe(struct pci_dev *pdev, return -ENOMEM; } par = info-par; + par-bus_type = PCI; info-fix = atyfb_fix; info-device = pdev-dev; par-pci_id = pdev-device; @@ -3732,10 +3720,6 @@ static void atyfb_remove(struct fb_info *info) #endif #ifdef CONFIG_MTRR - if (par-mtrr_reg = 0) { - mtrr_del(par-mtrr_reg, 0, 0); - par-mtrr_reg = -1; - } if (par-mtrr_aper = 0) { mtrr_del(par-mtrr_aper, 0, 0); par-mtrr_aper = -1; -- 2.3.2.209.gd67f9d5.dirty -- To unsubscribe from this list: send the line unsubscribe linux-fbdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Ville Syrjälä syrj...@sci.fi http://www.sci.fi/~syrjala/ -- Ville Syrjälä syrj...@sci.fi http://www.sci.fi/~syrjala/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: Whatever the code may or may not do, the x86 architecture uses power-of-two MTRR sizes. So I'm confused. There should be no confusion, I simply did not know that *was* the requirement for x86, if that is the case we should add a check for that and perhaps generalize a helper that does the power of two helper changes, the cleanest I found was the vesafb driver solution. Thoughts? Luis ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Fri, Mar 20, 2015 at 04:52:18PM -0700, Andy Lutomirski wrote: On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); - if (par-mtrr_aper = 0 !par-aux_start) { - /* Make a hole for mmio. */ - par-mtrr_reg = mtrr_add(par-res_start + 0x80 - -GUI_RESERVE, GUI_RESERVE, -MTRR_TYPE_UNCACHABLE, 1); - if (par-mtrr_reg 0) { - mtrr_del(par-mtrr_aper, 0, 0); - par-mtrr_aper = -1; - } - } } #endif @@ -2776,10 +2765,6 @@ aty_init_exit: par-pll_ops-set_pll(info, par-saved_pll); #ifdef CONFIG_MTRR - if (par-mtrr_reg = 0) { - mtrr_del(par-mtrr_reg, 0, 0); - par-mtrr_reg = -1; - } if (par-mtrr_aper = 0) { mtrr_del(par-mtrr_aper, 0, 0); par-mtrr_aper = -1; @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, } info-fix.mmio_start = raddr; - par-ati_regbase = ioremap(info-fix.mmio_start, 0x1000); + par-ati_regbase = ioremap_nocache(info-fix.mmio_start, 0x1000); Double-check me, but I think that ioremap_nocache + WC MTRR = WC. Precicely, in this case the WC hole was obtained by using MTRR WC. This patch removes that WC hole trick and now we can be explciit about only wanting ioremap_nocache() on the registers, that is WC is not desired here and is not used. The patch does not highlight the fact that there was left in place another ioremap() call for the framebuffer: info-screen_base = ioremap(info-fix.smem_start, info-fix.smem_len); That is the one that later after this patch we use ioremap_wc() for. This patch just removes the hole solution. That's all. Luis ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: --- The last thing we do must do to remain sane is ensure we use the info-fix.smem_start and info-fix.smem_len for the framebuffer MTRR as we know that is always well adjusted. The *one* concern here would be if the MTRR is not in units of 4K __but__ we already know that in the PCI case this cannot happen, in the shared space setting the MTRR would be up to 0x7ff000 and assuming a 4K page: ; 0x7ff000 / 0x1000 2047 Also, internally when MTRR is used mtrr_add() will use mtrr_check() and that should splat a warning when the MTRR base and size are not compatible with what is expected for MTRR usage. --- If any of this is too risky we can use the __arch_phys_wc_add() (or as Andy suggested perhaps use set_page_* stuff, although I am still evaluating this) but I did this change to show the effort required for a change when the registers / framebuffer is on the same PCI BAR but at different offsets. [0] scripts/kernel-doc -man -function mtrr_add_page arch/x86/kernel/cpu/mtrr/main.c | nroff -man | less Luis ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Fri, Mar 27, 2015 at 10:37:04AM +0200, Ville Syrjälä wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? Still waiting for an answer... Sorry was in the desert for a bit, I'm back now. Luis ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Fri, Mar 27, 2015 at 1:12 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Fri, Mar 20, 2015 at 04:52:18PM -0700, Andy Lutomirski wrote: On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); - if (par-mtrr_aper = 0 !par-aux_start) { - /* Make a hole for mmio. */ - par-mtrr_reg = mtrr_add(par-res_start + 0x80 - -GUI_RESERVE, GUI_RESERVE, -MTRR_TYPE_UNCACHABLE, 1); - if (par-mtrr_reg 0) { - mtrr_del(par-mtrr_aper, 0, 0); - par-mtrr_aper = -1; - } - } } #endif @@ -2776,10 +2765,6 @@ aty_init_exit: par-pll_ops-set_pll(info, par-saved_pll); #ifdef CONFIG_MTRR - if (par-mtrr_reg = 0) { - mtrr_del(par-mtrr_reg, 0, 0); - par-mtrr_reg = -1; - } if (par-mtrr_aper = 0) { mtrr_del(par-mtrr_aper, 0, 0); par-mtrr_aper = -1; @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, } info-fix.mmio_start = raddr; - par-ati_regbase = ioremap(info-fix.mmio_start, 0x1000); + par-ati_regbase = ioremap_nocache(info-fix.mmio_start, 0x1000); Double-check me, but I think that ioremap_nocache + WC MTRR = WC. Precicely, in this case the WC hole was obtained by using MTRR WC. This patch removes that WC hole trick and now we can be explciit about only wanting ioremap_nocache() on the registers, that is WC is not desired here and is not used. The patch does not highlight the fact that there was left in place another ioremap() call for the framebuffer: info-screen_base = ioremap(info-fix.smem_start, info-fix.smem_len); That is the one that later after this patch we use ioremap_wc() for. This patch just removes the hole solution. That's all. I don't understand. If I read it right, there's a 2^n byte BAR. You're requesting WC for the whole think using arch_phys_wc_add. On a PAT system that has no effect and all is well. On a non-PAT system, it adds an MTRR. That means that you need to override the MTRR somehow for the mmio regs, and UC- won't do the trick. Or am I missing something here? --Andy Luis -- Andy Lutomirski AMA Capital Management, LLC ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote: On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: Whatever the code may or may not do, the x86 architecture uses power-of-two MTRR sizes. So I'm confused. There should be no confusion, I simply did not know that *was* the requirement for x86, if that is the case we should add a check for that and perhaps generalize a helper that does the power of two helper changes, the cleanest I found was the vesafb driver solution. Thoughts? The vesafb solution is bad since you'll only end up covering only the first 4MB of the framebuffer instead of the almost 8MB you want. Which in practice will mean throwing away half the VRAM since you really don't want the massive performance hit from accessing it as UC. And that would mean giving up decent display resolutions as well :( And the other option of trying to cover the remainder with multiple ever smaller MTRRs doesn't work either since you'll run out of MTRRs very quickly. This is precisely why I used the hole method in atyfb in the first place. I don't really like the idea of any new mtrr code not supporting that use case, especially as these things tend to be present in older machines where PAT isn't an option. According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6, non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have an effective memory type of UC. Hence my suggestion to add ioremap_x86_uc and/or set_memory_x86_uc to punch a UC hole in an otherwise WC MTRR-covered region. ioremap_nocache is UC- (even on non-PAT unless I misunderstood how this stuff works), so ioremap_nocache by itself isn't good enough. --Andy ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote: On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote: On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote: On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote: On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote: On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? As per mtrr_add_page() [0] the base and size are just supposed to be in units of 4 KiB, although the practice is to use powers of 2 in *some* drivers this is not standardized and by no means recorded as a requirement. Obviously powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add() will use mtrr_check() to verify the the same requirement. Furthermore, as per my commit log message: Whatever the code may or may not do, the x86 architecture uses power-of-two MTRR sizes. So I'm confused. There should be no confusion, I simply did not know that *was* the requirement for x86, if that is the case we should add a check for that and perhaps generalize a helper that does the power of two helper changes, the cleanest I found was the vesafb driver solution. Thoughts? The vesafb solution is bad since you'll only end up covering only the first 4MB of the framebuffer instead of the almost 8MB you want. Which in practice will mean throwing away half the VRAM since you really don't want the massive performance hit from accessing it as UC. And that would mean giving up decent display resolutions as well :( And the other option of trying to cover the remainder with multiple ever smaller MTRRs doesn't work either since you'll run out of MTRRs very quickly. This is precisely why I used the hole method in atyfb in the first place. I don't really like the idea of any new mtrr code not supporting that use case, especially as these things tend to be present in older machines where PAT isn't an option. According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6, non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have an effective memory type of UC. Hence my suggestion to add ioremap_x86_uc and/or set_memory_x86_uc to punch a UC hole in an otherwise WC MTRR-covered region. OK I think I get it now. And I take it this would hopefully only be used for non-PAT systems? Would there be a use case for PAT systems? I wonder if we can wrap this under some APIs to make it clean and hide this dirty thing behind the scenes, it seems a fragile and error prone and my hope would be that we won't need more specialization in this area for PAT systems. ioremap_nocache is UC- (even on non-PAT unless I misunderstood how this stuff works), so ioremap_nocache by itself isn't good enough. Thanks for the clarification. Luis ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote: diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); MTRRs need power of two size, so how is this supposed to work? - if (par-mtrr_aper = 0 !par-aux_start) { - /* Make a hole for mmio. */ - par-mtrr_reg = mtrr_add(par-res_start + 0x80 - - GUI_RESERVE, GUI_RESERVE, - MTRR_TYPE_UNCACHABLE, 1); - if (par-mtrr_reg 0) { - mtrr_del(par-mtrr_aper, 0, 0); - par-mtrr_aper = -1; - } - } } #endif @@ -2776,10 +2765,6 @@ aty_init_exit: par-pll_ops-set_pll(info, par-saved_pll); #ifdef CONFIG_MTRR - if (par-mtrr_reg = 0) { - mtrr_del(par-mtrr_reg, 0, 0); - par-mtrr_reg = -1; - } if (par-mtrr_aper = 0) { mtrr_del(par-mtrr_aper, 0, 0); par-mtrr_aper = -1; @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, } info-fix.mmio_start = raddr; - par-ati_regbase = ioremap(info-fix.mmio_start, 0x1000); + par-ati_regbase = ioremap_nocache(info-fix.mmio_start, 0x1000); if (par-ati_regbase == NULL) return -ENOMEM; @@ -3491,6 +3476,8 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, info-fix.smem_start = addr; info-fix.smem_len = 0x80; + aty_fudge_framebuffer_len(info); + info-screen_base = ioremap(info-fix.smem_start, info-fix.smem_len); if (info-screen_base == NULL) { ret = -ENOMEM; @@ -3563,6 +3550,7 @@ static int atyfb_pci_probe(struct pci_dev *pdev, return -ENOMEM; } par = info-par; + par-bus_type = PCI; info-fix = atyfb_fix; info-device = pdev-dev; par-pci_id = pdev-device; @@ -3732,10 +3720,6 @@ static void atyfb_remove(struct fb_info *info) #endif #ifdef CONFIG_MTRR - if (par-mtrr_reg = 0) { - mtrr_del(par-mtrr_reg, 0, 0); - par-mtrr_reg = -1; - } if (par-mtrr_aper = 0) { mtrr_del(par-mtrr_aper, 0, 0); par-mtrr_aper = -1; -- 2.3.2.209.gd67f9d5.dirty -- To unsubscribe from this list: send the line unsubscribe linux-fbdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Ville Syrjälä syrj...@sci.fi http://www.sci.fi/~syrjala/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around
On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com The atyfb driver uses an MTRR work around since some cards use the same PCI BAR for the framebuffer and MMIO. In such cards the last page is used for MMIO, the rest for the framebuffer, so on those cards we ioremap() the MMIO page alone, then again ioremap() the full framebuffer including the MMIO space *and* ___then___ use an MTRR with MTRR_TYPE_WRCOMB on the full PCI BAR... and finally hole in an MTRR_TYPE_UNCACHABLE MTRR only for MMIO. This is a terrible fucking work around, and should by no means be necessary however evidence through a large series of conversion of drivers to ioremap_wc() for the framebuffer shows that around the time MTRR started becoming popular devices did not have things lined up for easily separating the framebuffer and MMIO register access. In some cases a driver requires significant intrusive changes in order to make the split for an ioremap() for MMIO registers and another ioremap_wc() for the framebuffer, at other times a bit of careful study of the driver suffices. This example driver falls into the later category. We can replace the MTRR MTRR_TYPE_UNCACHABLE work around by using ioremap_nocache(), the length of the MMIO space should already be correct. The other part we need to correct is ensuring we ioremap() for the framebuffer only the required size. Since the ioremap() happens early on probe for PCI devices before aty_init() where we typically adjust the length and know how to do it, we can fix this by pegging the bus type as PCI on PCI probe, and finally fudging and framebuffer length just as we do on aty_init(). The last thing we do must do to remain sane is ensure we use the info-fix.smem_start and info-fix.smem_len for the framebuffer MTRR as we know that is always well adjusted. The *one* concern here would be if the MTRR is not in units of 4K __but__ we already know that in the PCI case this cannot happen, in the shared space setting the MTRR would be up to 0x7ff000 and assuming a 4K page: ; 0x7ff000 / 0x1000 2047 Also, internally when MTRR is used mtrr_add() will use mtrr_check() and that should splat a warning when the MTRR base and size are not compatible with what is expected for MTRR usage. This fix lets us nuke the MTRR_TYPE_UNCACHABLE MTRR hole. Cc: Suresh Siddha suresh.b.sid...@intel.com Cc: Venkatesh Pallipadi venkatesh.pallip...@intel.com Cc: Ingo Molnar mi...@elte.hu Cc: Linus Torvalds torva...@linux-foundation.org Cc: Thomas Gleixner t...@linutronix.de Cc: Juergen Gross jgr...@suse.com Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Andy Lutomirski l...@amacapital.net Cc: Dave Airlie airl...@redhat.com Cc: Antonino Daplas adap...@gmail.com Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com Cc: Tomi Valkeinen tomi.valkei...@ti.com Cc: linux-fb...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Luis R. Rodriguez mcg...@suse.com --- drivers/video/fbdev/aty/atyfb.h | 1 - drivers/video/fbdev/aty/atyfb_base.c | 28 ++-- 2 files changed, 6 insertions(+), 23 deletions(-) diff --git a/drivers/video/fbdev/aty/atyfb.h b/drivers/video/fbdev/aty/atyfb.h index 1f39a62..89ec439 100644 --- a/drivers/video/fbdev/aty/atyfb.h +++ b/drivers/video/fbdev/aty/atyfb.h @@ -184,7 +184,6 @@ struct atyfb_par { spinlock_t int_lock; #ifdef CONFIG_MTRR int mtrr_aper; - int mtrr_reg; #endif u32 mem_cntl; struct crtc saved_crtc; diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 8025624..8875e56 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info) #ifdef CONFIG_MTRR par-mtrr_aper = -1; - par-mtrr_reg = -1; if (!nomtrr) { - /* Cover the whole resource. */ - par-mtrr_aper = mtrr_add(par-res_start, par-res_size, + par-mtrr_aper = mtrr_add(info-fix.smem_start, + info-fix.smem_len, MTRR_TYPE_WRCOMB, 1); - if (par-mtrr_aper = 0 !par-aux_start) { - /* Make a hole for mmio. */ - par-mtrr_reg = mtrr_add(par-res_start + 0x80 - -GUI_RESERVE, GUI_RESERVE, -MTRR_TYPE_UNCACHABLE, 1); - if (par-mtrr_reg 0) { - mtrr_del(par-mtrr_aper, 0, 0); - par-mtrr_aper = -1; - } - } } #endif @@ -2776,10 +2765,6 @@ aty_init_exit: par-pll_ops-set_pll(info, par-saved_pll); #ifdef CONFIG_MTRR -