Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-04-02 Thread Andy Lutomirski
On Thu, Apr 2, 2015 at 12:45 PM, Luis R. Rodriguez mcg...@suse.com wrote:
 On Wed, Apr 01, 2015 at 05:04:08PM -0700, Andy Lutomirski wrote:
 On Wed, Apr 1, 2015 at 4:52 PM, Luis R. Rodriguez mcg...@suse.com wrote:
  On Sat, Mar 28, 2015 at 02:23:34PM +0200, Ville Syrjälä wrote:
  On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote:
   On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote:
On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote:
 On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
 On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
  On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez 
  mcg...@suse.com wrote:
   On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
   On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez 
   wrote:
diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
b/drivers/video/fbdev/aty/atyfb_base.c
index 8025624..8875e56 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info 
*info)
   
 #ifdef CONFIG_MTRR
par-mtrr_aper = -1;
-   par-mtrr_reg = -1;
if (!nomtrr) {
-   /* Cover the whole resource. */
-   par-mtrr_aper = mtrr_add(par-res_start, 
par-res_size,
+   par-mtrr_aper = mtrr_add(info-fix.smem_start,
+ info-fix.smem_len,
  MTRR_TYPE_WRCOMB, 1);
  
   MTRRs need power of two size, so how is this supposed to work?
  
   As per mtrr_add_page() [0] the base and size are just supposed 
   to be in units
   of 4 KiB, although the practice is to use powers of 2 in 
   *some* drivers this
   is not standardized and by no means recorded as a requirement. 
   Obviously
   powers of 2 will work too and you'd end up neatly aligned as 
   well. mtrr_add()
   will use mtrr_check() to verify the the same requirement. 
   Furthermore,
   as per my commit log message:
 
  Whatever the code may or may not do, the x86 architecture uses
  power-of-two MTRR sizes.  So I'm confused.

 There should be no confusion, I simply did not know that *was* the
 requirement for x86, if that is the case we should add a check for 
 that
 and perhaps generalize a helper that does the power of two helper 
 changes,
 the cleanest I found was the vesafb driver solution.

 Thoughts?

 The vesafb solution is bad since you'll only end up covering only
 the first 4MB of the framebuffer instead of the almost 8MB you want.
 Which in practice will mean throwing away half the VRAM since you 
 really
 don't want the massive performance hit from accessing it as UC. And 
 that
 would mean giving up decent display resolutions as well :(

 And the other option of trying to cover the remainder with multiple 
 ever
 smaller MTRRs doesn't work either since you'll run out of MTRRs very
 quickly.

 This is precisely why I used the hole method in atyfb in the first
 place.

 I don't really like the idea of any new mtrr code not supporting 
 that
 use case, especially as these things tend to be present in older 
 machines
 where PAT isn't an option.
   
According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6,
non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have
an effective memory type of UC.
 
  This is true but non-PAT systems that use just ioremap() will default to
  _PAGE_CACHE_MODE_UC_MINUS, not _PAGE_CACHE_MODE_UC, and 
  _PAGE_CACHE_MODE_UC_MINUS
  on Linux has PCD = 1, PWT = 0. The list comes from:
 
  uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = {
  [_PAGE_CACHE_MODE_WB  ] = 0 | 0,
  [_PAGE_CACHE_MODE_WC  ] = _PAGE_PWT | 0,
  [_PAGE_CACHE_MODE_UC_MINUS] = 0 | _PAGE_PCD,
  [_PAGE_CACHE_MODE_UC  ] = _PAGE_PWT | _PAGE_PCD,
  [_PAGE_CACHE_MODE_WT  ] = 0 | _PAGE_PCD,
  [_PAGE_CACHE_MODE_WP  ] = 0 | _PAGE_PCD,
  };
 
  This can better be read here:
 
   PAT
   |PCD
   ||PWT
   |||
   000 WB  _PAGE_CACHE_MODE_WB
   001 WC  _PAGE_CACHE_MODE_WC
   010 UC- _PAGE_CACHE_MODE_UC_MINUS
   011 UC  _PAGE_CACHE_MODE_UC
 
  On x86 ioremap() defaults to ioremap_nocache() and right now that uses
  _PAGE_CACHE_MODE_UC_MINUS not _PAGE_CACHE_MODE_UC. We have two cases
  to consider for non-PAT systems then:
 
  a) Right now as ioremap() and ioremap_nocache() default to 
  _PAGE_CACHE_MODE_UC_MINUS
 on x86. In this case using a WC MTRR seems to use PWT=0, PCD=1, and
 table table 

Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-04-02 Thread Luis R. Rodriguez
On Wed, Apr 01, 2015 at 05:04:08PM -0700, Andy Lutomirski wrote:
 On Wed, Apr 1, 2015 at 4:52 PM, Luis R. Rodriguez mcg...@suse.com wrote:
  On Sat, Mar 28, 2015 at 02:23:34PM +0200, Ville Syrjälä wrote:
  On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote:
   On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote:
On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote:
 On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
 On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
  On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez 
  mcg...@suse.com wrote:
   On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
   On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez 
   wrote:
diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
b/drivers/video/fbdev/aty/atyfb_base.c
index 8025624..8875e56 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info 
*info)
   
 #ifdef CONFIG_MTRR
par-mtrr_aper = -1;
-   par-mtrr_reg = -1;
if (!nomtrr) {
-   /* Cover the whole resource. */
-   par-mtrr_aper = mtrr_add(par-res_start, 
par-res_size,
+   par-mtrr_aper = mtrr_add(info-fix.smem_start,
+ info-fix.smem_len,
  MTRR_TYPE_WRCOMB, 1);
  
   MTRRs need power of two size, so how is this supposed to work?
  
   As per mtrr_add_page() [0] the base and size are just supposed 
   to be in units
   of 4 KiB, although the practice is to use powers of 2 in *some* 
   drivers this
   is not standardized and by no means recorded as a requirement. 
   Obviously
   powers of 2 will work too and you'd end up neatly aligned as 
   well. mtrr_add()
   will use mtrr_check() to verify the the same requirement. 
   Furthermore,
   as per my commit log message:
 
  Whatever the code may or may not do, the x86 architecture uses
  power-of-two MTRR sizes.  So I'm confused.

 There should be no confusion, I simply did not know that *was* the
 requirement for x86, if that is the case we should add a check for 
 that
 and perhaps generalize a helper that does the power of two helper 
 changes,
 the cleanest I found was the vesafb driver solution.

 Thoughts?

 The vesafb solution is bad since you'll only end up covering only
 the first 4MB of the framebuffer instead of the almost 8MB you want.
 Which in practice will mean throwing away half the VRAM since you 
 really
 don't want the massive performance hit from accessing it as UC. And 
 that
 would mean giving up decent display resolutions as well :(

 And the other option of trying to cover the remainder with multiple 
 ever
 smaller MTRRs doesn't work either since you'll run out of MTRRs very
 quickly.

 This is precisely why I used the hole method in atyfb in the first
 place.

 I don't really like the idea of any new mtrr code not supporting that
 use case, especially as these things tend to be present in older 
 machines
 where PAT isn't an option.
   
According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6,
non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have
an effective memory type of UC.
 
  This is true but non-PAT systems that use just ioremap() will default to
  _PAGE_CACHE_MODE_UC_MINUS, not _PAGE_CACHE_MODE_UC, and 
  _PAGE_CACHE_MODE_UC_MINUS
  on Linux has PCD = 1, PWT = 0. The list comes from:
 
  uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = {
  [_PAGE_CACHE_MODE_WB  ] = 0 | 0,
  [_PAGE_CACHE_MODE_WC  ] = _PAGE_PWT | 0,
  [_PAGE_CACHE_MODE_UC_MINUS] = 0 | _PAGE_PCD,
  [_PAGE_CACHE_MODE_UC  ] = _PAGE_PWT | _PAGE_PCD,
  [_PAGE_CACHE_MODE_WT  ] = 0 | _PAGE_PCD,
  [_PAGE_CACHE_MODE_WP  ] = 0 | _PAGE_PCD,
  };
 
  This can better be read here:
 
   PAT
   |PCD
   ||PWT
   |||
   000 WB  _PAGE_CACHE_MODE_WB
   001 WC  _PAGE_CACHE_MODE_WC
   010 UC- _PAGE_CACHE_MODE_UC_MINUS
   011 UC  _PAGE_CACHE_MODE_UC
 
  On x86 ioremap() defaults to ioremap_nocache() and right now that uses
  _PAGE_CACHE_MODE_UC_MINUS not _PAGE_CACHE_MODE_UC. We have two cases
  to consider for non-PAT systems then:
 
  a) Right now as ioremap() and ioremap_nocache() default to 
  _PAGE_CACHE_MODE_UC_MINUS
 on x86. In this case using a WC MTRR seems to use PWT=0, PCD=1, and
 table table 11-6 on non-PAT systems seems to place this situation as
 implementation 

Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-04-01 Thread Luis R. Rodriguez
On Sat, Mar 28, 2015 at 02:23:34PM +0200, Ville Syrjälä wrote:
 On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote:
  On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote:
   On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote:
On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
 On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez 
 mcg...@suse.com wrote:
  On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
  On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
   diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
   b/drivers/video/fbdev/aty/atyfb_base.c
   index 8025624..8875e56 100644
   --- a/drivers/video/fbdev/aty/atyfb_base.c
   +++ b/drivers/video/fbdev/aty/atyfb_base.c
   @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info 
   *info)
  
#ifdef CONFIG_MTRR
   par-mtrr_aper = -1;
   -   par-mtrr_reg = -1;
   if (!nomtrr) {
   -   /* Cover the whole resource. */
   -   par-mtrr_aper = mtrr_add(par-res_start, 
   par-res_size,
   +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
   + info-fix.smem_len,
 MTRR_TYPE_WRCOMB, 1);
 
  MTRRs need power of two size, so how is this supposed to work?
 
  As per mtrr_add_page() [0] the base and size are just supposed to 
  be in units
  of 4 KiB, although the practice is to use powers of 2 in *some* 
  drivers this
  is not standardized and by no means recorded as a requirement. 
  Obviously
  powers of 2 will work too and you'd end up neatly aligned as well. 
  mtrr_add()
  will use mtrr_check() to verify the the same requirement. 
  Furthermore,
  as per my commit log message:

 Whatever the code may or may not do, the x86 architecture uses
 power-of-two MTRR sizes.  So I'm confused.
   
There should be no confusion, I simply did not know that *was* the
requirement for x86, if that is the case we should add a check for that
and perhaps generalize a helper that does the power of two helper 
changes,
the cleanest I found was the vesafb driver solution.
   
Thoughts?
   
The vesafb solution is bad since you'll only end up covering only
the first 4MB of the framebuffer instead of the almost 8MB you want.
Which in practice will mean throwing away half the VRAM since you really
don't want the massive performance hit from accessing it as UC. And that
would mean giving up decent display resolutions as well :(
   
And the other option of trying to cover the remainder with multiple ever
smaller MTRRs doesn't work either since you'll run out of MTRRs very
quickly.
   
This is precisely why I used the hole method in atyfb in the first
place.
   
I don't really like the idea of any new mtrr code not supporting that
use case, especially as these things tend to be present in older 
machines
where PAT isn't an option.
   
   According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6,
   non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have
   an effective memory type of UC.

This is true but non-PAT systems that use just ioremap() will default to
_PAGE_CACHE_MODE_UC_MINUS, not _PAGE_CACHE_MODE_UC, and 
_PAGE_CACHE_MODE_UC_MINUS
on Linux has PCD = 1, PWT = 0. The list comes from:

uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = {  
[_PAGE_CACHE_MODE_WB  ] = 0 | 0,
[_PAGE_CACHE_MODE_WC  ] = _PAGE_PWT | 0,
[_PAGE_CACHE_MODE_UC_MINUS] = 0 | _PAGE_PCD,
[_PAGE_CACHE_MODE_UC  ] = _PAGE_PWT | _PAGE_PCD,
[_PAGE_CACHE_MODE_WT  ] = 0 | _PAGE_PCD,
[_PAGE_CACHE_MODE_WP  ] = 0 | _PAGE_PCD,
};  

This can better be read here:

 PAT
 |PCD
 ||PWT
 |||
 000 WB  _PAGE_CACHE_MODE_WB
 001 WC  _PAGE_CACHE_MODE_WC
 010 UC- _PAGE_CACHE_MODE_UC_MINUS
 011 UC  _PAGE_CACHE_MODE_UC

On x86 ioremap() defaults to ioremap_nocache() and right now that uses
_PAGE_CACHE_MODE_UC_MINUS not _PAGE_CACHE_MODE_UC. We have two cases
to consider for non-PAT systems then:

a) Right now as ioremap() and ioremap_nocache() default to 
_PAGE_CACHE_MODE_UC_MINUS
   on x86. In this case using a WC MTRR seems to use PWT=0, PCD=1, and
   table table 11-6 on non-PAT systems seems to place this situation as
   implementation defined and not encouraged.

a) when commit de33c442e x86 PAT: fix performance drop for glx, use
   UC minus for 

Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-04-01 Thread Andy Lutomirski
On Wed, Apr 1, 2015 at 4:52 PM, Luis R. Rodriguez mcg...@suse.com wrote:
 On Sat, Mar 28, 2015 at 02:23:34PM +0200, Ville Syrjälä wrote:
 On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote:
  On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote:
   On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote:
On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
 On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez 
 mcg...@suse.com wrote:
  On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
  On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez 
  wrote:
   diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
   b/drivers/video/fbdev/aty/atyfb_base.c
   index 8025624..8875e56 100644
   --- a/drivers/video/fbdev/aty/atyfb_base.c
   +++ b/drivers/video/fbdev/aty/atyfb_base.c
   @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info 
   *info)
  
#ifdef CONFIG_MTRR
   par-mtrr_aper = -1;
   -   par-mtrr_reg = -1;
   if (!nomtrr) {
   -   /* Cover the whole resource. */
   -   par-mtrr_aper = mtrr_add(par-res_start, 
   par-res_size,
   +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
   + info-fix.smem_len,
 MTRR_TYPE_WRCOMB, 1);
 
  MTRRs need power of two size, so how is this supposed to work?
 
  As per mtrr_add_page() [0] the base and size are just supposed to 
  be in units
  of 4 KiB, although the practice is to use powers of 2 in *some* 
  drivers this
  is not standardized and by no means recorded as a requirement. 
  Obviously
  powers of 2 will work too and you'd end up neatly aligned as 
  well. mtrr_add()
  will use mtrr_check() to verify the the same requirement. 
  Furthermore,
  as per my commit log message:

 Whatever the code may or may not do, the x86 architecture uses
 power-of-two MTRR sizes.  So I'm confused.
   
There should be no confusion, I simply did not know that *was* the
requirement for x86, if that is the case we should add a check for 
that
and perhaps generalize a helper that does the power of two helper 
changes,
the cleanest I found was the vesafb driver solution.
   
Thoughts?
   
The vesafb solution is bad since you'll only end up covering only
the first 4MB of the framebuffer instead of the almost 8MB you want.
Which in practice will mean throwing away half the VRAM since you 
really
don't want the massive performance hit from accessing it as UC. And 
that
would mean giving up decent display resolutions as well :(
   
And the other option of trying to cover the remainder with multiple 
ever
smaller MTRRs doesn't work either since you'll run out of MTRRs very
quickly.
   
This is precisely why I used the hole method in atyfb in the first
place.
   
I don't really like the idea of any new mtrr code not supporting that
use case, especially as these things tend to be present in older 
machines
where PAT isn't an option.
  
   According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6,
   non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have
   an effective memory type of UC.

 This is true but non-PAT systems that use just ioremap() will default to
 _PAGE_CACHE_MODE_UC_MINUS, not _PAGE_CACHE_MODE_UC, and 
 _PAGE_CACHE_MODE_UC_MINUS
 on Linux has PCD = 1, PWT = 0. The list comes from:

 uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = {
 [_PAGE_CACHE_MODE_WB  ] = 0 | 0,
 [_PAGE_CACHE_MODE_WC  ] = _PAGE_PWT | 0,
 [_PAGE_CACHE_MODE_UC_MINUS] = 0 | _PAGE_PCD,
 [_PAGE_CACHE_MODE_UC  ] = _PAGE_PWT | _PAGE_PCD,
 [_PAGE_CACHE_MODE_WT  ] = 0 | _PAGE_PCD,
 [_PAGE_CACHE_MODE_WP  ] = 0 | _PAGE_PCD,
 };

 This can better be read here:

  PAT
  |PCD
  ||PWT
  |||
  000 WB  _PAGE_CACHE_MODE_WB
  001 WC  _PAGE_CACHE_MODE_WC
  010 UC- _PAGE_CACHE_MODE_UC_MINUS
  011 UC  _PAGE_CACHE_MODE_UC

 On x86 ioremap() defaults to ioremap_nocache() and right now that uses
 _PAGE_CACHE_MODE_UC_MINUS not _PAGE_CACHE_MODE_UC. We have two cases
 to consider for non-PAT systems then:

 a) Right now as ioremap() and ioremap_nocache() default to 
 _PAGE_CACHE_MODE_UC_MINUS
on x86. In this case using a WC MTRR seems to use PWT=0, PCD=1, and
table table 11-6 on non-PAT systems seems to place this situation as
implementation defined and not encouraged.

 a) when commit de33c442e x86 PAT: fix performance drop for glx, use
UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()
gets 

Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-28 Thread Ville Syrjälä
On Sat, Mar 28, 2015 at 01:28:18AM +0100, Luis R. Rodriguez wrote:
 On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote:
  On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote:
   On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
   On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com 
wrote:
 On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
 On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
  diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
  b/drivers/video/fbdev/aty/atyfb_base.c
  index 8025624..8875e56 100644
  --- a/drivers/video/fbdev/aty/atyfb_base.c
  +++ b/drivers/video/fbdev/aty/atyfb_base.c
  @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
 
   #ifdef CONFIG_MTRR
  par-mtrr_aper = -1;
  -   par-mtrr_reg = -1;
  if (!nomtrr) {
  -   /* Cover the whole resource. */
  -   par-mtrr_aper = mtrr_add(par-res_start, 
  par-res_size,
  +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
  + info-fix.smem_len,
MTRR_TYPE_WRCOMB, 1);

 MTRRs need power of two size, so how is this supposed to work?

 As per mtrr_add_page() [0] the base and size are just supposed to be 
 in units
 of 4 KiB, although the practice is to use powers of 2 in *some* 
 drivers this
 is not standardized and by no means recorded as a requirement. 
 Obviously
 powers of 2 will work too and you'd end up neatly aligned as well. 
 mtrr_add()
 will use mtrr_check() to verify the the same requirement. 
 Furthermore,
 as per my commit log message:
   
Whatever the code may or may not do, the x86 architecture uses
power-of-two MTRR sizes.  So I'm confused.
  
   There should be no confusion, I simply did not know that *was* the
   requirement for x86, if that is the case we should add a check for that
   and perhaps generalize a helper that does the power of two helper 
   changes,
   the cleanest I found was the vesafb driver solution.
  
   Thoughts?
  
   The vesafb solution is bad since you'll only end up covering only
   the first 4MB of the framebuffer instead of the almost 8MB you want.
   Which in practice will mean throwing away half the VRAM since you really
   don't want the massive performance hit from accessing it as UC. And that
   would mean giving up decent display resolutions as well :(
  
   And the other option of trying to cover the remainder with multiple ever
   smaller MTRRs doesn't work either since you'll run out of MTRRs very
   quickly.
  
   This is precisely why I used the hole method in atyfb in the first
   place.
  
   I don't really like the idea of any new mtrr code not supporting that
   use case, especially as these things tend to be present in older machines
   where PAT isn't an option.
  
  According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6,
  non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have
  an effective memory type of UC.  Hence my suggestion to add
  ioremap_x86_uc and/or set_memory_x86_uc to punch a UC hole in an
  otherwise WC MTRR-covered region.
 
 OK I think I get it now.
 
 And I take it this would hopefully only be used for non-PAT systems?
 Would there be a use case for PAT systems? I wonder if we can wrap
 this under some APIs to make it clean and hide this dirty thing
 behind the scenes, it seems a fragile and error prone and my hope
 would be that we won't need more specialization in this area for
 PAT systems.

One potential complication is kernel vs. userspace mmap. MTRR applies to
the physical address, but PAT applies to the virtual address, so with
the WC MTRR you get WC for userspace for free as well. Also the
userspace mmaps request will have the length of smem_len (at most), so
it won't be the nice power of two in that case.

Also on PAT systems w/o a BIOS provided WC MTRR, the fbdev mmap seems
to be total crap at the moment. IIRC I have a patch to fix things a bit...

From 4e6d70d223f35953c8a11a58cf3376a8a001fa4f Mon Sep 17 00:00:00 2001
From: Ville Syrjala syrj...@sci.fi
Date: Fri, 15 Apr 2011 04:02:43 +0300
Subject: [PATCH] fb: writecombine fb

---
 drivers/video/fbdev/core/fbmem.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index 0705d88..ecbde0e 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -1396,6 +1396,7 @@ fb_mmap(struct file *file, struct vm_area_struct * vma)
unsigned long mmio_pgoff;
unsigned long start;
u32 len;
+   bool mmio = false;
 
if (!info)
return -ENODEV;
@@ -1426,11 +1427,20 @@ fb_mmap(struct file *file, 

Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-28 Thread Ville Syrjälä
On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
 On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
  On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote:
   On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
   On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
b/drivers/video/fbdev/aty/atyfb_base.c
index 8025624..8875e56 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
   
 #ifdef CONFIG_MTRR
par-mtrr_aper = -1;
-   par-mtrr_reg = -1;
if (!nomtrr) {
-   /* Cover the whole resource. */
-   par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
+   par-mtrr_aper = mtrr_add(info-fix.smem_start,
+ info-fix.smem_len,
  MTRR_TYPE_WRCOMB, 1);
  
   MTRRs need power of two size, so how is this supposed to work?
  
   As per mtrr_add_page() [0] the base and size are just supposed to be in 
   units
   of 4 KiB, although the practice is to use powers of 2 in *some* drivers 
   this
   is not standardized and by no means recorded as a requirement. Obviously
   powers of 2 will work too and you'd end up neatly aligned as well. 
   mtrr_add()
   will use mtrr_check() to verify the the same requirement. Furthermore,
   as per my commit log message:
  
  Whatever the code may or may not do, the x86 architecture uses
  power-of-two MTRR sizes.  So I'm confused.
 
 There should be no confusion, I simply did not know that *was* the
 requirement for x86, if that is the case we should add a check for that
 and perhaps generalize a helper that does the power of two helper changes,
 the cleanest I found was the vesafb driver solution.
 
 Thoughts?

The vesafb solution is bad since you'll only end up covering only
the first 4MB of the framebuffer instead of the almost 8MB you want.
Which in practice will mean throwing away half the VRAM since you really
don't want the massive performance hit from accessing it as UC. And that
would mean giving up decent display resolutions as well :(

And the other option of trying to cover the remainder with multiple ever
smaller MTRRs doesn't work either since you'll run out of MTRRs very
quickly.

This is precisely why I used the hole method in atyfb in the first
place.

I don't really like the idea of any new mtrr code not supporting that
use case, especially as these things tend to be present in older machines
where PAT isn't an option.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-27 Thread Ville Syrjälä
On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
 On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
  diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
  b/drivers/video/fbdev/aty/atyfb_base.c
  index 8025624..8875e56 100644
  --- a/drivers/video/fbdev/aty/atyfb_base.c
  +++ b/drivers/video/fbdev/aty/atyfb_base.c
  @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
   
   #ifdef CONFIG_MTRR
  par-mtrr_aper = -1;
  -   par-mtrr_reg = -1;
  if (!nomtrr) {
  -   /* Cover the whole resource. */
  -   par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
  +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
  + info-fix.smem_len,
MTRR_TYPE_WRCOMB, 1);
 
 MTRRs need power of two size, so how is this supposed to work?

Still waiting for an answer...

 
  -   if (par-mtrr_aper = 0  !par-aux_start) {
  -   /* Make a hole for mmio. */
  -   par-mtrr_reg = mtrr_add(par-res_start + 0x80 -
  -GUI_RESERVE, GUI_RESERVE,
  -MTRR_TYPE_UNCACHABLE, 1);
  -   if (par-mtrr_reg  0) {
  -   mtrr_del(par-mtrr_aper, 0, 0);
  -   par-mtrr_aper = -1;
  -   }
  -   }
  }
   #endif
   
  @@ -2776,10 +2765,6 @@ aty_init_exit:
  par-pll_ops-set_pll(info, par-saved_pll);
   
   #ifdef CONFIG_MTRR
  -   if (par-mtrr_reg = 0) {
  -   mtrr_del(par-mtrr_reg, 0, 0);
  -   par-mtrr_reg = -1;
  -   }
  if (par-mtrr_aper = 0) {
  mtrr_del(par-mtrr_aper, 0, 0);
  par-mtrr_aper = -1;
  @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, 
  struct fb_info *info,
  }
   
  info-fix.mmio_start = raddr;
  -   par-ati_regbase = ioremap(info-fix.mmio_start, 0x1000);
  +   par-ati_regbase = ioremap_nocache(info-fix.mmio_start, 0x1000);
  if (par-ati_regbase == NULL)
  return -ENOMEM;
   
  @@ -3491,6 +3476,8 @@ static int atyfb_setup_generic(struct pci_dev *pdev, 
  struct fb_info *info,
  info-fix.smem_start = addr;
  info-fix.smem_len = 0x80;
   
  +   aty_fudge_framebuffer_len(info);
  +
  info-screen_base = ioremap(info-fix.smem_start, info-fix.smem_len);
  if (info-screen_base == NULL) {
  ret = -ENOMEM;
  @@ -3563,6 +3550,7 @@ static int atyfb_pci_probe(struct pci_dev *pdev,
  return -ENOMEM;
  }
  par = info-par;
  +   par-bus_type = PCI;
  info-fix = atyfb_fix;
  info-device = pdev-dev;
  par-pci_id = pdev-device;
  @@ -3732,10 +3720,6 @@ static void atyfb_remove(struct fb_info *info)
   #endif
   
   #ifdef CONFIG_MTRR
  -   if (par-mtrr_reg = 0) {
  -   mtrr_del(par-mtrr_reg, 0, 0);
  -   par-mtrr_reg = -1;
  -   }
  if (par-mtrr_aper = 0) {
  mtrr_del(par-mtrr_aper, 0, 0);
  par-mtrr_aper = -1;
  -- 
  2.3.2.209.gd67f9d5.dirty
  
  --
  To unsubscribe from this list: send the line unsubscribe linux-fbdev in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 -- 
 Ville Syrjälä
 syrj...@sci.fi
 http://www.sci.fi/~syrjala/

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-27 Thread Luis R. Rodriguez
On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
 On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com wrote:
  On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
  On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
   diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
   b/drivers/video/fbdev/aty/atyfb_base.c
   index 8025624..8875e56 100644
   --- a/drivers/video/fbdev/aty/atyfb_base.c
   +++ b/drivers/video/fbdev/aty/atyfb_base.c
   @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
  
#ifdef CONFIG_MTRR
   par-mtrr_aper = -1;
   -   par-mtrr_reg = -1;
   if (!nomtrr) {
   -   /* Cover the whole resource. */
   -   par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
   +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
   + info-fix.smem_len,
 MTRR_TYPE_WRCOMB, 1);
 
  MTRRs need power of two size, so how is this supposed to work?
 
  As per mtrr_add_page() [0] the base and size are just supposed to be in 
  units
  of 4 KiB, although the practice is to use powers of 2 in *some* drivers this
  is not standardized and by no means recorded as a requirement. Obviously
  powers of 2 will work too and you'd end up neatly aligned as well. 
  mtrr_add()
  will use mtrr_check() to verify the the same requirement. Furthermore,
  as per my commit log message:
 
 Whatever the code may or may not do, the x86 architecture uses
 power-of-two MTRR sizes.  So I'm confused.

There should be no confusion, I simply did not know that *was* the
requirement for x86, if that is the case we should add a check for that
and perhaps generalize a helper that does the power of two helper changes,
the cleanest I found was the vesafb driver solution.

Thoughts?

 Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-27 Thread Luis R. Rodriguez
On Fri, Mar 20, 2015 at 04:52:18PM -0700, Andy Lutomirski wrote:
 On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
 mcg...@do-not-panic.com wrote:
  diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
  b/drivers/video/fbdev/aty/atyfb_base.c
  index 8025624..8875e56 100644
  --- a/drivers/video/fbdev/aty/atyfb_base.c
  +++ b/drivers/video/fbdev/aty/atyfb_base.c
  @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
 
   #ifdef CONFIG_MTRR
  par-mtrr_aper = -1;
  -   par-mtrr_reg = -1;
  if (!nomtrr) {
  -   /* Cover the whole resource. */
  -   par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
  +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
  + info-fix.smem_len,
MTRR_TYPE_WRCOMB, 1);
  -   if (par-mtrr_aper = 0  !par-aux_start) {
  -   /* Make a hole for mmio. */
  -   par-mtrr_reg = mtrr_add(par-res_start + 0x80 -
  -GUI_RESERVE, GUI_RESERVE,
  -MTRR_TYPE_UNCACHABLE, 1);
  -   if (par-mtrr_reg  0) {
  -   mtrr_del(par-mtrr_aper, 0, 0);
  -   par-mtrr_aper = -1;
  -   }
  -   }
  }
   #endif
 
  @@ -2776,10 +2765,6 @@ aty_init_exit:
  par-pll_ops-set_pll(info, par-saved_pll);
 
   #ifdef CONFIG_MTRR
  -   if (par-mtrr_reg = 0) {
  -   mtrr_del(par-mtrr_reg, 0, 0);
  -   par-mtrr_reg = -1;
  -   }
  if (par-mtrr_aper = 0) {
  mtrr_del(par-mtrr_aper, 0, 0);
  par-mtrr_aper = -1;
  @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, 
  struct fb_info *info,
  }
 
  info-fix.mmio_start = raddr;
  -   par-ati_regbase = ioremap(info-fix.mmio_start, 0x1000);
  +   par-ati_regbase = ioremap_nocache(info-fix.mmio_start, 0x1000);
 
 Double-check me, but I think that ioremap_nocache + WC MTRR = WC. 

Precicely, in this case the WC hole was obtained by using MTRR WC. This
patch removes that WC hole trick and now we can be explciit about
only wanting ioremap_nocache() on the registers, that is WC is not
desired here and is not used. The patch does not highlight the fact
that there was left in place another ioremap() call for the framebuffer:

info-screen_base = ioremap(info-fix.smem_start, info-fix.smem_len);

That is the one that later after this patch we use ioremap_wc() for.
This patch just removes the hole solution. That's all.

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-27 Thread Luis R. Rodriguez
On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
 On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
  diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
  b/drivers/video/fbdev/aty/atyfb_base.c
  index 8025624..8875e56 100644
  --- a/drivers/video/fbdev/aty/atyfb_base.c
  +++ b/drivers/video/fbdev/aty/atyfb_base.c
  @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
   
   #ifdef CONFIG_MTRR
  par-mtrr_aper = -1;
  -   par-mtrr_reg = -1;
  if (!nomtrr) {
  -   /* Cover the whole resource. */
  -   par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
  +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
  + info-fix.smem_len,
MTRR_TYPE_WRCOMB, 1);
 
 MTRRs need power of two size, so how is this supposed to work?

As per mtrr_add_page() [0] the base and size are just supposed to be in units
of 4 KiB, although the practice is to use powers of 2 in *some* drivers this
is not standardized and by no means recorded as a requirement. Obviously
powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add()
will use mtrr_check() to verify the the same requirement. Furthermore,
as per my commit log message:

---
The last thing we do must do to remain sane is ensure we
use the info-fix.smem_start and info-fix.smem_len for
the framebuffer MTRR as we know that is always well adjusted.
The *one* concern here would be if the MTRR is not in units
of 4K __but__ we already know that in the PCI case this cannot
happen, in the shared space setting the MTRR would be up to
0x7ff000 and assuming a 4K page:

; 0x7ff000 / 0x1000
2047

Also, internally when MTRR is used mtrr_add() will use mtrr_check()
and that should splat a warning when the MTRR base and size are
not compatible with what is expected for MTRR usage.
---

If any of this is too risky we can use the __arch_phys_wc_add() (or as
Andy suggested perhaps use set_page_* stuff, although I am still evaluating
this) but I did this change to show the effort required for a change when
the registers / framebuffer is on the same PCI BAR but at different offsets.

[0] scripts/kernel-doc -man -function mtrr_add_page 
arch/x86/kernel/cpu/mtrr/main.c | nroff -man | less

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-27 Thread Luis R. Rodriguez
On Fri, Mar 27, 2015 at 10:37:04AM +0200, Ville Syrjälä wrote:
 On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
  On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
   diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
   b/drivers/video/fbdev/aty/atyfb_base.c
   index 8025624..8875e56 100644
   --- a/drivers/video/fbdev/aty/atyfb_base.c
   +++ b/drivers/video/fbdev/aty/atyfb_base.c
   @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)

#ifdef CONFIG_MTRR
 par-mtrr_aper = -1;
   - par-mtrr_reg = -1;
 if (!nomtrr) {
   - /* Cover the whole resource. */
   - par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
   + par-mtrr_aper = mtrr_add(info-fix.smem_start,
   +   info-fix.smem_len,
   MTRR_TYPE_WRCOMB, 1);
  
  MTRRs need power of two size, so how is this supposed to work?
 
 Still waiting for an answer...

Sorry was in the desert for a bit, I'm back now.

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-27 Thread Andy Lutomirski
On Fri, Mar 27, 2015 at 1:12 PM, Luis R. Rodriguez mcg...@suse.com wrote:
 On Fri, Mar 20, 2015 at 04:52:18PM -0700, Andy Lutomirski wrote:
 On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
 mcg...@do-not-panic.com wrote:
  diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
  b/drivers/video/fbdev/aty/atyfb_base.c
  index 8025624..8875e56 100644
  --- a/drivers/video/fbdev/aty/atyfb_base.c
  +++ b/drivers/video/fbdev/aty/atyfb_base.c
  @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
 
   #ifdef CONFIG_MTRR
  par-mtrr_aper = -1;
  -   par-mtrr_reg = -1;
  if (!nomtrr) {
  -   /* Cover the whole resource. */
  -   par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
  +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
  + info-fix.smem_len,
MTRR_TYPE_WRCOMB, 1);
  -   if (par-mtrr_aper = 0  !par-aux_start) {
  -   /* Make a hole for mmio. */
  -   par-mtrr_reg = mtrr_add(par-res_start + 0x80 
  -
  -GUI_RESERVE, GUI_RESERVE,
  -MTRR_TYPE_UNCACHABLE, 1);
  -   if (par-mtrr_reg  0) {
  -   mtrr_del(par-mtrr_aper, 0, 0);
  -   par-mtrr_aper = -1;
  -   }
  -   }
  }
   #endif
 
  @@ -2776,10 +2765,6 @@ aty_init_exit:
  par-pll_ops-set_pll(info, par-saved_pll);
 
   #ifdef CONFIG_MTRR
  -   if (par-mtrr_reg = 0) {
  -   mtrr_del(par-mtrr_reg, 0, 0);
  -   par-mtrr_reg = -1;
  -   }
  if (par-mtrr_aper = 0) {
  mtrr_del(par-mtrr_aper, 0, 0);
  par-mtrr_aper = -1;
  @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, 
  struct fb_info *info,
  }
 
  info-fix.mmio_start = raddr;
  -   par-ati_regbase = ioremap(info-fix.mmio_start, 0x1000);
  +   par-ati_regbase = ioremap_nocache(info-fix.mmio_start, 0x1000);

 Double-check me, but I think that ioremap_nocache + WC MTRR = WC.

 Precicely, in this case the WC hole was obtained by using MTRR WC. This
 patch removes that WC hole trick and now we can be explciit about
 only wanting ioremap_nocache() on the registers, that is WC is not
 desired here and is not used. The patch does not highlight the fact
 that there was left in place another ioremap() call for the framebuffer:

 info-screen_base = ioremap(info-fix.smem_start, info-fix.smem_len);

 That is the one that later after this patch we use ioremap_wc() for.
 This patch just removes the hole solution. That's all.


I don't understand.

If I read it right, there's a 2^n byte BAR.  You're requesting WC for
the whole think using arch_phys_wc_add.  On a PAT system that has no
effect and all is well.  On a non-PAT system, it adds an MTRR.  That
means that you need to override the MTRR somehow for the mmio regs,
and UC- won't do the trick.

Or am I missing something here?

--Andy

   Luis



-- 
Andy Lutomirski
AMA Capital Management, LLC

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-27 Thread Andy Lutomirski
On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote:
 On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
 On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
  On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com 
  wrote:
   On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
   On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
b/drivers/video/fbdev/aty/atyfb_base.c
index 8025624..8875e56 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
   
 #ifdef CONFIG_MTRR
par-mtrr_aper = -1;
-   par-mtrr_reg = -1;
if (!nomtrr) {
-   /* Cover the whole resource. */
-   par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
+   par-mtrr_aper = mtrr_add(info-fix.smem_start,
+ info-fix.smem_len,
  MTRR_TYPE_WRCOMB, 1);
  
   MTRRs need power of two size, so how is this supposed to work?
  
   As per mtrr_add_page() [0] the base and size are just supposed to be in 
   units
   of 4 KiB, although the practice is to use powers of 2 in *some* drivers 
   this
   is not standardized and by no means recorded as a requirement. Obviously
   powers of 2 will work too and you'd end up neatly aligned as well. 
   mtrr_add()
   will use mtrr_check() to verify the the same requirement. Furthermore,
   as per my commit log message:
 
  Whatever the code may or may not do, the x86 architecture uses
  power-of-two MTRR sizes.  So I'm confused.

 There should be no confusion, I simply did not know that *was* the
 requirement for x86, if that is the case we should add a check for that
 and perhaps generalize a helper that does the power of two helper changes,
 the cleanest I found was the vesafb driver solution.

 Thoughts?

 The vesafb solution is bad since you'll only end up covering only
 the first 4MB of the framebuffer instead of the almost 8MB you want.
 Which in practice will mean throwing away half the VRAM since you really
 don't want the massive performance hit from accessing it as UC. And that
 would mean giving up decent display resolutions as well :(

 And the other option of trying to cover the remainder with multiple ever
 smaller MTRRs doesn't work either since you'll run out of MTRRs very
 quickly.

 This is precisely why I used the hole method in atyfb in the first
 place.

 I don't really like the idea of any new mtrr code not supporting that
 use case, especially as these things tend to be present in older machines
 where PAT isn't an option.

According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6,
non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have
an effective memory type of UC.  Hence my suggestion to add
ioremap_x86_uc and/or set_memory_x86_uc to punch a UC hole in an
otherwise WC MTRR-covered region.

ioremap_nocache is UC- (even on non-PAT unless I misunderstood how
this stuff works), so ioremap_nocache by itself isn't good enough.

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-27 Thread Luis R. Rodriguez
On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote:
 On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä syrj...@sci.fi wrote:
  On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
  On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
   On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez mcg...@suse.com 
   wrote:
On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
 diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
 b/drivers/video/fbdev/aty/atyfb_base.c
 index 8025624..8875e56 100644
 --- a/drivers/video/fbdev/aty/atyfb_base.c
 +++ b/drivers/video/fbdev/aty/atyfb_base.c
 @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)

  #ifdef CONFIG_MTRR
 par-mtrr_aper = -1;
 -   par-mtrr_reg = -1;
 if (!nomtrr) {
 -   /* Cover the whole resource. */
 -   par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
 +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
 + info-fix.smem_len,
   MTRR_TYPE_WRCOMB, 1);
   
MTRRs need power of two size, so how is this supposed to work?
   
As per mtrr_add_page() [0] the base and size are just supposed to be 
in units
of 4 KiB, although the practice is to use powers of 2 in *some* 
drivers this
is not standardized and by no means recorded as a requirement. 
Obviously
powers of 2 will work too and you'd end up neatly aligned as well. 
mtrr_add()
will use mtrr_check() to verify the the same requirement. Furthermore,
as per my commit log message:
  
   Whatever the code may or may not do, the x86 architecture uses
   power-of-two MTRR sizes.  So I'm confused.
 
  There should be no confusion, I simply did not know that *was* the
  requirement for x86, if that is the case we should add a check for that
  and perhaps generalize a helper that does the power of two helper changes,
  the cleanest I found was the vesafb driver solution.
 
  Thoughts?
 
  The vesafb solution is bad since you'll only end up covering only
  the first 4MB of the framebuffer instead of the almost 8MB you want.
  Which in practice will mean throwing away half the VRAM since you really
  don't want the massive performance hit from accessing it as UC. And that
  would mean giving up decent display resolutions as well :(
 
  And the other option of trying to cover the remainder with multiple ever
  smaller MTRRs doesn't work either since you'll run out of MTRRs very
  quickly.
 
  This is precisely why I used the hole method in atyfb in the first
  place.
 
  I don't really like the idea of any new mtrr code not supporting that
  use case, especially as these things tend to be present in older machines
  where PAT isn't an option.
 
 According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6,
 non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have
 an effective memory type of UC.  Hence my suggestion to add
 ioremap_x86_uc and/or set_memory_x86_uc to punch a UC hole in an
 otherwise WC MTRR-covered region.

OK I think I get it now.

And I take it this would hopefully only be used for non-PAT systems?
Would there be a use case for PAT systems? I wonder if we can wrap
this under some APIs to make it clean and hide this dirty thing
behind the scenes, it seems a fragile and error prone and my hope
would be that we won't need more specialization in this area for
PAT systems.

 ioremap_nocache is UC- (even on non-PAT unless I misunderstood how
 this stuff works), so ioremap_nocache by itself isn't good enough.

Thanks for the clarification.

 Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-23 Thread Ville Syrjälä
On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
 diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
 b/drivers/video/fbdev/aty/atyfb_base.c
 index 8025624..8875e56 100644
 --- a/drivers/video/fbdev/aty/atyfb_base.c
 +++ b/drivers/video/fbdev/aty/atyfb_base.c
 @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
  
  #ifdef CONFIG_MTRR
   par-mtrr_aper = -1;
 - par-mtrr_reg = -1;
   if (!nomtrr) {
 - /* Cover the whole resource. */
 - par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
 + par-mtrr_aper = mtrr_add(info-fix.smem_start,
 +   info-fix.smem_len,
 MTRR_TYPE_WRCOMB, 1);

MTRRs need power of two size, so how is this supposed to work?

 - if (par-mtrr_aper = 0  !par-aux_start) {
 - /* Make a hole for mmio. */
 - par-mtrr_reg = mtrr_add(par-res_start + 0x80 -
 -  GUI_RESERVE, GUI_RESERVE,
 -  MTRR_TYPE_UNCACHABLE, 1);
 - if (par-mtrr_reg  0) {
 - mtrr_del(par-mtrr_aper, 0, 0);
 - par-mtrr_aper = -1;
 - }
 - }
   }
  #endif
  
 @@ -2776,10 +2765,6 @@ aty_init_exit:
   par-pll_ops-set_pll(info, par-saved_pll);
  
  #ifdef CONFIG_MTRR
 - if (par-mtrr_reg = 0) {
 - mtrr_del(par-mtrr_reg, 0, 0);
 - par-mtrr_reg = -1;
 - }
   if (par-mtrr_aper = 0) {
   mtrr_del(par-mtrr_aper, 0, 0);
   par-mtrr_aper = -1;
 @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, 
 struct fb_info *info,
   }
  
   info-fix.mmio_start = raddr;
 - par-ati_regbase = ioremap(info-fix.mmio_start, 0x1000);
 + par-ati_regbase = ioremap_nocache(info-fix.mmio_start, 0x1000);
   if (par-ati_regbase == NULL)
   return -ENOMEM;
  
 @@ -3491,6 +3476,8 @@ static int atyfb_setup_generic(struct pci_dev *pdev, 
 struct fb_info *info,
   info-fix.smem_start = addr;
   info-fix.smem_len = 0x80;
  
 + aty_fudge_framebuffer_len(info);
 +
   info-screen_base = ioremap(info-fix.smem_start, info-fix.smem_len);
   if (info-screen_base == NULL) {
   ret = -ENOMEM;
 @@ -3563,6 +3550,7 @@ static int atyfb_pci_probe(struct pci_dev *pdev,
   return -ENOMEM;
   }
   par = info-par;
 + par-bus_type = PCI;
   info-fix = atyfb_fix;
   info-device = pdev-dev;
   par-pci_id = pdev-device;
 @@ -3732,10 +3720,6 @@ static void atyfb_remove(struct fb_info *info)
  #endif
  
  #ifdef CONFIG_MTRR
 - if (par-mtrr_reg = 0) {
 - mtrr_del(par-mtrr_reg, 0, 0);
 - par-mtrr_reg = -1;
 - }
   if (par-mtrr_aper = 0) {
   mtrr_del(par-mtrr_aper, 0, 0);
   par-mtrr_aper = -1;
 -- 
 2.3.2.209.gd67f9d5.dirty
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-fbdev in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO hole work around

2015-03-20 Thread Andy Lutomirski
On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
mcg...@do-not-panic.com wrote:
 From: Luis R. Rodriguez mcg...@suse.com

 The atyfb driver uses an MTRR work around since some
 cards use the same PCI BAR for the framebuffer and MMIO.
 In such cards the last page is used for MMIO, the rest for
 the framebuffer, so on those cards we ioremap() the MMIO
 page alone, then again ioremap() the full framebuffer
 including the MMIO space *and* ___then___ use an MTRR with
 MTRR_TYPE_WRCOMB on the full PCI BAR... and finally hole
 in an MTRR_TYPE_UNCACHABLE MTRR only for MMIO.

 This is a terrible fucking work around, and should by no means
 be necessary however evidence through a large series of conversion
 of drivers to ioremap_wc() for the framebuffer shows that around
 the time MTRR started becoming popular devices did not have things
 lined up for easily separating the framebuffer and MMIO register
 access. In some cases a driver requires significant intrusive
 changes in order to make the split for an ioremap() for MMIO registers
 and another ioremap_wc() for the framebuffer, at other times a
 bit of careful study of the driver suffices. This example driver
 falls into the later category.

 We can replace the MTRR MTRR_TYPE_UNCACHABLE
 work around by using ioremap_nocache(), the length of the
 MMIO space should already be correct. The other part we
 need to correct is ensuring we ioremap() for the framebuffer
 only the required size. Since the ioremap() happens early
 on probe for PCI devices before aty_init() where we typically
 adjust the length and know how to do it, we can fix this by
 pegging the bus type as PCI on PCI probe, and finally fudging
 and framebuffer length just as we do on aty_init().

 The last thing we do must do to remain sane is ensure we
 use the info-fix.smem_start and info-fix.smem_len for
 the framebuffer MTRR as we know that is always well adjusted.
 The *one* concern here would be if the MTRR is not in units
 of 4K __but__ we already know that in the PCI case this cannot
 happen, in the shared space setting the MTRR would be up to
 0x7ff000 and assuming a 4K page:

 ; 0x7ff000 / 0x1000
 2047

 Also, internally when MTRR is used mtrr_add() will use mtrr_check()
 and that should splat a warning when the MTRR base and size are
 not compatible with what is expected for MTRR usage.

 This fix lets us nuke the MTRR_TYPE_UNCACHABLE MTRR hole.

 Cc: Suresh Siddha suresh.b.sid...@intel.com
 Cc: Venkatesh Pallipadi venkatesh.pallip...@intel.com
 Cc: Ingo Molnar mi...@elte.hu
 Cc: Linus Torvalds torva...@linux-foundation.org
 Cc: Thomas Gleixner t...@linutronix.de
 Cc: Juergen Gross jgr...@suse.com
 Cc: Daniel Vetter daniel.vet...@ffwll.ch
 Cc: Andy Lutomirski l...@amacapital.net
 Cc: Dave Airlie airl...@redhat.com
 Cc: Antonino Daplas adap...@gmail.com
 Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com
 Cc: Tomi Valkeinen tomi.valkei...@ti.com
 Cc: linux-fb...@vger.kernel.org
 Cc: linux-ker...@vger.kernel.org
 Signed-off-by: Luis R. Rodriguez mcg...@suse.com
 ---
  drivers/video/fbdev/aty/atyfb.h  |  1 -
  drivers/video/fbdev/aty/atyfb_base.c | 28 ++--
  2 files changed, 6 insertions(+), 23 deletions(-)

 diff --git a/drivers/video/fbdev/aty/atyfb.h b/drivers/video/fbdev/aty/atyfb.h
 index 1f39a62..89ec439 100644
 --- a/drivers/video/fbdev/aty/atyfb.h
 +++ b/drivers/video/fbdev/aty/atyfb.h
 @@ -184,7 +184,6 @@ struct atyfb_par {
 spinlock_t int_lock;
  #ifdef CONFIG_MTRR
 int mtrr_aper;
 -   int mtrr_reg;
  #endif
 u32 mem_cntl;
 struct crtc saved_crtc;
 diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
 b/drivers/video/fbdev/aty/atyfb_base.c
 index 8025624..8875e56 100644
 --- a/drivers/video/fbdev/aty/atyfb_base.c
 +++ b/drivers/video/fbdev/aty/atyfb_base.c
 @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)

  #ifdef CONFIG_MTRR
 par-mtrr_aper = -1;
 -   par-mtrr_reg = -1;
 if (!nomtrr) {
 -   /* Cover the whole resource. */
 -   par-mtrr_aper = mtrr_add(par-res_start, par-res_size,
 +   par-mtrr_aper = mtrr_add(info-fix.smem_start,
 + info-fix.smem_len,
   MTRR_TYPE_WRCOMB, 1);
 -   if (par-mtrr_aper = 0  !par-aux_start) {
 -   /* Make a hole for mmio. */
 -   par-mtrr_reg = mtrr_add(par-res_start + 0x80 -
 -GUI_RESERVE, GUI_RESERVE,
 -MTRR_TYPE_UNCACHABLE, 1);
 -   if (par-mtrr_reg  0) {
 -   mtrr_del(par-mtrr_aper, 0, 0);
 -   par-mtrr_aper = -1;
 -   }
 -   }
 }
  #endif

 @@ -2776,10 +2765,6 @@ aty_init_exit:
 par-pll_ops-set_pll(info, par-saved_pll);

  #ifdef CONFIG_MTRR
 -